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Response  surface  methodology  (RSM)  involves  techniques  for  experimental 

design,  statistical  modeling,  and  optimization.  This  collection  of  techniques  has 

been  used  successfully  for  the  development,  improvement,  and  optimization  of 

processes.    RSM  was  developed  under  the  assumptions  of  a  linear  model  and 

normally  distributed  response.   However,  in  many  modern  applications,  such  as 

drug  development  and  quality  assurance,  the  response  is  of  a  discrete  nature.  The 

response  could  also  be  multivariate  in  nature  consisting  of  a  number  of  observed 

outcomes  for  an  experimental  unit.  Traditional  RSM  techniques  are  inappropriate  in 

such  situations  since  the  basic  assumptions  are  not  quite  valid.  A  more  appropriate 

analysis  of  such  situations  would  involve  the  use  of  generalized  linear  models  or 

multiresponse  modeling  techniques. 

vii 


Designs  for  fitting  a  generalized  linear  model  depend  on  the  unknown 
parameters  of  the  model.  The  use  of  any  design  optimality  criterion  would  therefore 
require  some  prior  knowledge  of  the  parameters.  A  graphical  technique  is  proposed 
for  comparing  and  evaluating  designs  for  a  logistic  regression  model.  Quantiles  of 
the  mean-squared  error  of  prediction  are  obtained  on  concentric  surfaces  inside  a 
region  of  interest,  R.  For  a  given  design,  these  quantiles  depend  on  the  model's 
parameters.  Plots  of  the  maxima  and  minima  of  the  quantiles,  over  a  subset  of  the 
parameter  space,  produce  the  so-called  quantile  dispersion  graphs  (QDGs).  The 
plots  provide  a  comprehensive  assessment  of  the  overall  prediction  capability  of  the 
design  within  the  region  R.  They  also  depict  the  dependence  of  the  design  on  the 
model's  parameters.  The  QDGs  can  therefore  be  conveniently  used  to  compare 
several  candidate  designs.  Sequential  generation  of  optimal  designs  and  an  extension 
of  the  QDGs  to  the  scaled  mean-squared  error  of  prediction  are  also  considered. 

In  a  multiresponse  experiment  the  researcher  may  be  interested  in  modeling 
a  function  of  the  responses.  A  method  is  presented  for  the  modeling  of  a  function  of 
multiple  responses.  The  proposed  procedure  is  based  upon  a  multiresponse  analysis 
of  the  data  and  Taylor's  series  approximation  of  the  function  of  interest.   The 
proposed  technique  compares  favorably  with  a  standard  "naive"  procedure. 
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CHAPTER  1 
INTRODUCTION 

In  the  twenty-first  century,  statistical  data  analysis  will  continue  to  grow  in 
application  and  importance.  Response  surface  methodology  (RSM)  with  its  basic 
goals  of  experimental  design  and  modeling  with  the  purpose  of  process  improvement 
will  also  continue  to  develop  in  the  present  century.   Myers  (1999)  emphasizes 
that  two  areas  into  which  RSM  is  moving  are  generalized  linear  models  (GLMs) 
and  multiple  responses.   The  applications  using  GLMs  include  the  modeling  of 
binary  response  data  and  Poisson  count  data.  While  multiple  responses  are  often 
encountered  in  such  situations  as  the  yield  and  profit  of  a  chemical  process  or  the 
assessment  of  the  overall  quality  of  a  manufactured  product. 

In  the  area  of  GLMs,  the  main  issue  confronting  RSM  is  experimental  design. 
There  is  software  available  in  many  of  the  statistical  packages  for  the  analysis  of 
GLMs.  Unfortunately,  however,  none  is  available,  as  far  as  we  know,  for  constructing 
designs  for  GLMs.  Myers  (1999,  p.  37)  stresses  that  new  and  creative  approaches  are 
needed  to  deal  with  the  design  issues  encountered  in  GLMs.  The  standard  response 
surface  designs  and  optimal  designs  were  developed  under  traditional  assumptions 
concerning  the  model  and  response.  Designs  for  GLMs  encounter  the  problem  of 
dependence  on  the  unknown  parameters.   The  objective  of  an  experiment  is  to 
provide  estimates  of  the  parameters;  therefore,  the  researcher  is  faced  with  a  serious 
dilemma.  The  researcher  is  required  to  provide  guessed  values  for  the  parameters  or 
employ  a  sequential  approach  to  the  experiment.  Methods  are  needed  to  assess  the 
sensitivity  of  GLMs  designs  to  the  unknown  parameters. 


Multiresponse  data  are  now  also  very  common  with  much  of  the  research  and 
analysis  focusing  on  optimization.  The  analysis  of  multiresponse  data  should  give 
consideration  to  the  correlation  structure  among  the  responses.  Khuri  and  Cornell 
(1996,  chapter  7)  consider  not  only  the  optimization  of  multiresponse  data,  but  the 
modeling,  choice  of  design,  and  lack  of  fit  testing  as  well.  In  certain  multiresponse 
experiments,  the  responses  may  be  further  used  in  the  computation  of  an  outcome 
of  interest  to  the  experimenter.  Basically,  the  experimenter  is  interested  in  modeling 
a  function  of  correlated  responses.  Methods  that  recognize  the  multivariate  nature 
of  the  data  are  needed  in  this  area. 

In  this  dissertation,  a  graphical  technique  for  the  comparison  of  designs  for 
logistic  regression  models  is  developed  and  a  modeling  procedure  for  the  function  of 
multiple  responses  is  proposed.  A  review  of  the  literature  concerning  GLMs  design 
and  multiresponse  modeling  is  contained  in  chapter  2.  This  chapter  also  gives  more 
details  concerning  the  areas  to  be  investigated  in  this  dissertation.   Chapters  3 
and  4  concern  the  development  of  the  so-called  quantile  dispersion  graphs  for  the 
comparison  of  designs  for  logistic  regression  models.  In  chapter  5,  a  new  method 
for  the  modeling  of  functions  of  responses  is  proposed  which  takes  into  account  the 
correlation  structure  among  the  responses.  Finally,  a  summary  and  a  list  of  future 
research  topics  are  included  in  chapter  6. 


CHAPTER  2 
LITERATURE  REVIEW 

2.1     Introduction 

Extensive  work  in  the  area  of  single-response  modeling  has  been  carried  out 
in  the  statistical  literature  since  the  1950's.  Response  surface  methodology  (RSM), 
as  developed  in  the  early  1950's  by  George  Box  and  several  other  researchers  (Box 
and  Wilson  1951),  mainly  considers  situations  involving  a  single-response  with 
independent  and  normally  distributed  errors.  Available  textbooks  on  RSM  include 
Box  and  Draper  (1987),  Khuri  and  Cornell  (1996),  and  Myers  and  Montgomery 
(1995). 

In  many  experimental  situations,  the  traditional  error  assumptions  are 
inappropriate.  The  generalized  linear  models  (GLMs)  approach  (McCullagh  and 
Nelder  1989)  provides  an  extension  of  linear  models  in  a  manner  that  allows  less 
stringent  error  and  model  assumptions.   Khuri  (1993)  provided  a  discussion  of 
the  use  of  RSM  within  the  framework  of  GLMs.   He  noted  that  optimal  design 
theory  based  on  traditional  RSM  was  no  longer  appropriate  in  the  GLMs  context. 
Sequential  design  and  Bayesian  design  are  approaches  employed  in  the  design  of 
experiments  for  GLMs. 

In  other  experimental  situations,  however,  several  responses  are  often  of 
interest.   For  example,  in  the  semiconductor  industry,  the  number  of  particles 
remaining  and  the  oxide  removal  rate  for  a  standard  wafer  cleaning  process  are  of 
interest  in  the  manufacturing  process.  Consider  also,  the  assessment  of  pencil  lead 
which  often  uses  the  following:  ash  test,  porosity  test,  strength  test,  and  outside 


diameter.  Each  of  these  examples  has  multiple  responses  that  may  be  correlated. 
Multiple  responses  are  also  encountered  through  the  modeling  of  a  response's  mean 
and  variance,  often  analyzed  as  a  dual  response. 

The  modeling  of  several  responses  should  take  into  consideration  the 
multivariate  nature  of  the  data.  A  multivariate  analysis  can  be  more  informative  and 
more  appropriate  than  a  collection  of  univariate  analyses.  Unfortunately,  however, 
multiresponse  modeling  tends  to  be  ignored  because  of  its  complexity.   The  use 
of  linear  models  and  analysis  of  variance  has  been  extended  to  modeling  multiple 
responses  (Anderson  1984,  chap.  8).  The  classical  multivariate  analysis  of  variance 
(MANOVA)  modeling  can  be  restrictive  in  that  each  of  the  responses  must  have  the 
same  linear  relationship  with  the  control  variables  and  the  same  design.  Multiple 
design  multivariate  regression  models  are  more  general  as  they  allow  the  responses 
to  have  different  relationships  with  the  control  variables  as  well  as  different  designs 
(McDonald  1975).   Khuri  (1996)  presented  a  thorough  review  of  multiresponse 
surface  methodology. 

2.2     Designs  for  Generalized  Linear  Models 
2.2.1     Review  of  Linear  Model  Design 

Developments  in  optimal  experimental  design  have  largely  been  concerned 
with  linear  models;  for  example,  see  the  books  by  Fedorov  (1972),  Silvey  (1980), 
Pukelsheim  (1993),  and  Atkinson  and  Donev  (1992).   From  a  RSM  perspective, 
the  books  by  Box  and  Draper  (1987),  Khuri  and  Cornell  (1996),  and  Myers  and 
Montgomery  (1995)  mainly  discuss  designs  for  linear  response  surface  models  with 
the  traditional  error  assumptions. 

In  the  linear  model  situation,  attention  is  primarily  focused  on  the  properties 
of  the  variance-covariance  matrix  for  the  estimates  of  the  model's  unknown 
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parameters.  Silvey  (1980,  p.  3)  presents  Fisher's  information  matrix  as  a  motivation 
for  this  approach.  In  this  case,  the  variance-covariance  matrix  of  the  estimates  is 
a2[X'X}-\ 

Traditional  design  criteria  focus  upon  properties  of  the  matrix  X'X  which 
is  independent  of  the  model's  parameters.  For  example,  the  D-optimality  criterion 
calls  for  the  maximization  of  the  determinant  of  X'X.  While  in  G-optimality,  the 
criterion  is  to  minimize  the  maximum  prediction  variance,  which  depends  upon 
X'X,  over  a  design  region. 

Kiefer  (1985)  extended  the  notion  of  a  fixed  n-point  design  to  a  probability 
measure  over  the  experimental  region,  termed  a  design  measure.   Silvey  (1980, 
pp.    13-14)  noted  that  the  use  of  design  measures  allowed  the  exploitation  of 
mathematical  techniques  to  solve  the  optimal  design  problem  resulting  in  some 
elegant  and  useful  theory.  The  most  notable  result  is  probably  the  Equivalence 
Theorem  of  Kiefer  and  Wolfowitz  (1960)  which  establishes  the  equivalence  of 
D-optimality  and  G-optimality  for  design  measures.  Further  discussions  of  design 
measures  and  Equivalence  Theorem  are  presented  in  Chapter  4. 

2.2.2     Generalized  Linear  Model  Design 

The  generalized  linear  models  (GLMs)  approach  allows  for  less  restrictive 
error  and  model  assumptions.  For  example,  normally  distributed  errors  and  constant 
variance  properties  are  no  longer  required.   The  text  by  McCullagh  and  Nelder 
(1989)  presented  a  thorough  account  of  the  theory  of  GLMs.  Khuri  (1993)  provided 
a  discussion  of  the  use  of  the  GLMs  approach  in  RSM.  Atkinson  and  Haines  (1996) 
provide  a  recent  review  of  the  design  theory  for  nonlinear  and  generalized  linear 
models. 


In  the  GLMs  situation  the  response,  y,  is  assumed  to  follow  a  distribution 
from  the  exponential  family.   This  includes  the  normal  as  well  as  the  binomial, 
Poisson,  and  gamma  distributions.   The  mean  of  the  response  is  modeled  as  a 
function  of  the  form,  E[y{x)\  =  n(x)  =  h[f'(x)(3]  ,  where  f(x)  is  a  vector  of  order 
p  x  1  whose  elements  consist  of  powers  and  products  of  powers  of  the  elements  of 
x  up  to  degree  d(>  1),  and  /3  is  a  vector  of  p  unknown  parameters.  The  function 
f(x)0  is  called  the  linear  predictor.  It  is  assumed  that  h(.)  is  a  strictly  monotone 
function.  Using  the  inverse  of  the  function  h(.)  we  have  g[fi(x)}  =  f'(x)/3.  The 
function  g(.)  is  called  the  link  function.  In  traditional  linear  models,  the  link  function 
is  the  identity  function.  The  method  of  maximum  likelihood  (ML)  is  utilized  to 
estimate  /3.  It  follows  that  an  approximation  for  the  variance-covariance  matrix  of 
the  ML  estimates  of  /3  is  the  inverse  of  Fisher's  information  matrix  which  is  p  x  p 
with  (i,  j)th  element  E[—    ^  '   ]  where  I  is  the  log-likelihood  function.   Iterative 
reweighted  least  squares  (IRWLS)  can  be  used  to  fit  a  generalized  linear  model  as 
discussed  in  Khuri  (1993)  and  McCullagh  and  Nelder  (1989). 

Let  z(x)  =  g[y(x)].  Using  the  first-order  terms  in  a  Taylor  series  expansion, 
z(x)  w  g\ji(x)]  +  [y(x)  -  n(x)]gf[p(x)].  Thus  E[z(x)]  m  g[fi{x)]  =  f'{x)0  and 
Vai[z(x)]  w  Var[y(x)](g'(tJi{x))2  =  w{x,0).  Using  IRWLS,  Var[£]  «  [X'WX]'1, 
where  (3  is  the  ML  estimate  of  0,  X  is  a  matrix  whose  wth  row  is  f'(xu),  and  W 
is  a  diagonal  matrix  whose  «th  diagonal  element  is        x  a  ,  where  xu  is  the  vector 

w{XUip) 

of  design  settings  at  the  ttth  experimental  run.   It  follows  that  the  information 

n 

matrix,  X'WX  —  V  — -J— ar/(xu)/'(xu),  depends  on  the  unknown  parameters, 
/3,  through  w{xu,f3).  Let  z(x)  =  f'{x)J3  denote  the  predicted  value  of  z  at  a  point 
x  in  the  experimental  region. 

In  the  GLMs  situation,  as  in  traditional  design  theory,  the  focus  has  been 
on  the  variance-covariance  matrix,  (X'WX)'1,  of  the  parameter  estimates,  where 


the  matrix  W  depends  upon  the  unknown  model  parameters.  A  D-optimal  design 
maximizes  the  determinant  of  X'WX;  however,  there  is  no  single  D-optimal  design 
because  of  the  depenency  on  the  unknown  parameters.   G-optimal  designs  are 
obtained  by  minimizing  the  maximum  prediction  variance  over  the  design  region, 
where  the  prediction  variance  is  Var[z(ae)]  w  f'(x)[X'WX]~1f(x). 

The  dependence  upon  the  unknown  parameters,  /3,  encountered  in  GLMs 
also  occurs  in  the  consideration  of  designs  for  nonlinear  regression  models.  Common 
approaches  for  dealing  with  the  dependency  problem  include  using  a  "good"  guess  of 
the  unknown  parameters,  a  sequential  approach  in  which  a  design  is  determined  in 
at  least  two  stages,  and  a  Bayesian  approach  which  introduces  a  prior  distribution 
on  the  parameters.  White  (1973)  provided  an  extension  of  Kiefer  and  Wolfowitz 
(1960)  Equivalence  Theorem  to  nonlinear  models.  Chapter  4  contains  a  review  of 
this  extension  to  the  Equivalence  Theorem. 

Developments  in  designs  for  GLMs  experiments  have  been  mostly 
concerned  with  binary  responses  commonly  modeled  using  the  logistic  regression 
model (Abdelbasit  and  Plackett  1983;  Minkin  1987;  Chaloner  and  Larntz  1989; 
Myers,  Myers,  and  Carter  1994).  Abdelbasit  and  Plackett  (1983)  discuss  designs 
for  experiments  dealing  with  binary  data,  specifically  biological  assay  or  fatigue 
experiments.  An  experiment  consists  of  subjects  receiving  a  dose  level  (stimulus), 
x,  to  which  the  subject  may  respond.  Interest  is  in  the  probability  of  a  response, 
7r(x).  D-optimal  designs  for  the  logistic  model,  logit  7r(a;)  =  log ^7 L  =  (3{x  —  /x), 
are  found  assuming  two-point  symmetric  designs  about  \l  and  equal  allocations.  The 
optimal  design  is  to  use  the  dose  levels  corresponding  to  ED176  and  ED82.4,  where 
EDioop  represents  the  value  of  x  which  produces  a  100p%  response  rate.  Note  that 
these  dose  levels  are  unknown  since  the  parameters  are  unknown.  Abdelbasit  and 
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Plackett  suggest  dividing  the  experiment  into  stages  and  using  improved  estimates 
after  each  stage  in  determining  the  dose  levels. 

Minkin  (1987)  considers  the  selection  of  a  design  for  logistic  regression  such 
that  the  likelihood-based  confidence  region  is  of  minimum  volume.  Minkin  notes  that 
maximization  of  the  determinant  of  Fisher's  information  matrix  is  in  effect  equivalent 
to  minimizing  the  volume  of  a  confidence  region  based  on  a  second-order  Taylor's 
approximation  of  the  log-likelihood  function.  This  work  generalizes  the  results  of 
Abdelbasit  and  Plackett  (1983)  that  designs  with  an  even  number  of  observations 
should  be  two  point  symmetric  designs  about  ED50  with  equal  allocation  at  EDi7.6 
and  ED82.4.  For  an  odd  number  of  observations,  N,  no  fewer  than  ^   ~  '  points 
should  be  located  at  each  of  the  two  dose  levels.  Again,  a  sequential  approach  to 
selecting  the  design  is  recommended.  Further  discussion  of  sequential  design  and 
analysis  is  given  by  Wu  (1985a, b)  and  Chaudhuri  and  Mykland  (1993). 

An  alternative  approach  to  using  a  "good"  guess  or  a  sequential  design  is 
to  utilize  a  Bayesian  methodology  by  placing  a  prior  distribution  on  the  unknown 
parameters.  A  thorough  review  of  Bayesian  experimental  design  is  given  by  Chaloner 
and  Verdinelli  (1995).  Chaloner  and  Larntz  (1989)  apply  the  Bayesian  design  theory 
to  logistic  regression  experiments.  Designs  were  found  using  a  range  of  uniform  and 
independent  prior  distributions  for  \i  and  (3.  It  was  concluded  that  the  number  of 
support  points  increased  with  the  uncertainty  in  the  prior  distribution. 

Myers,  Myers,  and  Carter  (1994)  investigated  optimal  designs  for  the  logistic 
model  by  considering  the  prediction  of  a  response.  The  prediction  variance  of  logit 
7T  is  considered.  The  authors  considered  general  designs  that  minimize  the  average 
prediction  variance,  the  so-called  Q-optimality,  and  designs  centered  around  ED5o 
that  minimize  the  maximum  prediction  variance,  namely  G-optimality.  Designs  were 
found  using  the  Nelder  and  Mead  (1965)  simplex  search  procedure.  The  authors 
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considered  the  robustness  of  the  designs  to  poor  initial  guesses  through  efficiencies. 
They  concluded  that  Q-optimal  designs  with  two/three  points  were  fairly  robust  to 
poor  initial  guesses.  Furthermore,  underestimating  the  slope,  0,  resulted  in  a  higher 
Q-efficiencies,  while  improved  estimates  of  ED50  {fi)  reduced  the  effect  of  a  poor 
slope  estimate. 

2.2.3     Graphical  Tools  for  Evaluating  Experimental  Designs 

Until  recently  attempts  to  compare  designs  have  been  made  using  single- 
valued  criteria,  such  as  D-optimality  and  G-optimality.  However,  the  success  of 
a  design  for  a  given  model  is  determined  by  its  ability  to  predict  the  response 
efficiently.  The  prediction  variance  function  can  be  utilized  to  determine  prediction 
capability.   G-optimality  addresses  the  prediction  variance  but  fails  to  provide 
information  about  adequacy  of  prediction  at  various  points  in  the  region  of  interest. 
Advances  in  computing  and  graphical  methods  have  recently  resulted  in  effective 
graphical  techniques  to  evaluate  designs  (Giovannitti-Jensen  and  Myers  1989; 
Myers,  Vining,  Giovannitti-Jensen,  and  Myers  1992;  Khuri,  Kim,  and  Um  1996). 
The  variance  dispersion  graphs  (VDGs)  presented  by  Giovannitti-Jensen  and  Myers 
(1989)  are  two  dimensional  plots  of  the  maximum,  minimum,  and  spherical  average 
prediction  variances  over  concentric  spheres  inside  a  region  of  interest  against  their 
radii.  Khuri,  Kim,  and  Um  (1996)  suggest  that  quantile  plots  of  the  prediction 
variance  on  various  spheres  inside  a  region  of  interest  allow  for  a  more  comprehensive 
and  accurate  assessment  of  the  prediction  capability  of  a  design. 

In  the  nonlinear  and  GLMs  situation,  the  VDGs  and  quantile  plots  encounter 
the  difficulty  of  dependence  upon  the  unknown  parameters  as  well  as  biased 
estimates.   Khuri  and  Lee  (1996)  present  a  graphical  technique  extending  the 
quantile  plots  to  nonlinear  models.    For  evaluating  the  prediction  capability 
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of  a  design,  Khuri  and  Lee  suggest  plotting  quantiles  of  the  estimated  scaled 
mean- squared  error  of  prediction,  ESMSEP,  on  concentric  surfaces  within  the  region 
of  interest.  The  mean-squared  error  is  utilized  to  address  the  bias  in  the  estimates 
and  is  put  on  a  "per  observation"  basis.  The  mean-squared  error  is  estimated  using 
the  fitted  model  because  of  the  dependence  on  the  model's  unknown  parameters. 
In  order  to  compare  designs,  Khuri  and  Lee  propose  the  so-called  minimum 
and  maximum  SMSEP  plots,  where  SMSEP  is  the  scaled  mean-squared  error  of 
prediction.  Nonlinear  models  with  only  one  control  variable  were  considered.  The 
minimum  and  maximum  SMSEP  are  found  over  a  subset  of  the  parameter  space  for 
each  design  and  are  plotted  against  x,  where  x  is  the  control  variable.  An  example 
was  presented  to  demonstrate  that  designs  based  on  a  single-valued  criterion  may 
perform  poorly  in  terms  of  the  mean-squared  error  of  prediction. 

The  development  of  quantile  dispersion  graphs  (QDGs)  for  logistic  regression 
models  is  presented  in  chapter  3.  Chapter  4  contains  details  about  using  the  QDGs 
for  the  comparison  of  traditional  RSM  designs  as  well  as  issues  concerning  the 
sequential  generation  of  optimal  designs. 

2.3     Review  of  Multiresponse  Modeling 
2.3.1     General  Multiresponse  Model 

Suppose  that  an  experimenter  is  interested  in  modeling  r  responses  using 
k  control  variables.  An  experiment  is  performed  where  the  responses  and  control 
variables  are  measured  on  each  of  N  experimental  runs. 

Multiresponse  models  allow  each  of  the  responses  to  have  a  different 
relationship  with  the  control  variables.  Each  response  is  represented  by  a  model 
function  involving  the  control  variables  and  a  set  of  unknown  parameter  values.  The 
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multivariate  nature  of  the  data  is  accounted  for  by  taking  into  consideration  the 
variance-covariance  structure  of  the  r  responses. 

At  the  uth  run,  let  yu  =  (yui,  yu2,  yU3,  •  •  •  ,  yUr)'  be  the  vector  of  r  responses, 
and  xu  =  (xui,  xu2,  xu3, . . .  ,  xuq)'  be  the  vector  of  q  control  variables.  The  model  for 
the  zth  response  is  of  the  form: 

yUi  =  fi(xu,/3)  +  eui   u  =  l,2,...,N  i=l,2,...,r  (2.1) 

where  /3  =  (/?i ,  /?2 ,  •  •  •  ,Pp)'  is  vector  of  p  (unknown)  parameters,  eui  is  a  random 
error,  and  the  /*  model  function  for  the  zth  response  is  of  known  form. 

The  standard  assumptions  about  the  random  errors  are:  E(eUJ)  =  0  all  u, 
i  and  E(eui€u>ii)  =  a^  if  u  =  u',  zero  otherwise.  Thus,  random  errors  (among  the 
r  responses)  from  the  same  experimental  run  are  assumed  to  be  correlated,  but 
random  errors  from  different  experimental  runs  are  uncorrelated. 

Assuming  that  the  vector  of  random  errors  for  the  uth.  run,  eu  = 
(eui,eu2,eu3)  ■  ■  •  i^-ut)'-,  has  a  multivariate  normal  distribution,  Box  and  Draper 
(1965)  utilized  the  Bayesian  methodology  to  estimate  /3.  The  likelihood  function 
for  the  N  x  r  response  matrix  Y,  where  the  zth  column  of  Y  is  the  vector  of  iV 
observations  on  the  zth  response  (i  =  1,2, . . .  ,r),  is  given  by 

L{y  I*  E)  =  ^Nrl^N^M-ltvace^Z'i^Zm)  (2.2) 

where  S  is  the  variance-covariance  matrix  (aij)  for  the  r  responses,  Z(/3)  =  Y  —  F((3), 
and  F((3)  is  an  TV  x  r  matrix  whose  («,i)th  element  is  fi(xu,/3).    Assuming 
noninformative  prior  distributions  for  0  and  S,  the  marginal  posterior  density  for  /3 
is 

P{fi\Y)  oc  \*ifi)Z(fi)\-\*  (2.3) 
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Estimation  of  j3  is  achieved  by  maximizing  this  posterior  density,  which  is  equivalent 
to  minimizing  the  determinant 

A(/9)  =  |Z'(/3)Z(/3)|  (2.4) 

with  respect  to  (3.   This  estimation  procedure  is  referred  to  as  the  Box-Draper 
determinant  criterion.  Bates  and  Watts  (1988,  pp.  138-139)  noted  that  this  Bayesian 
argument  is  in  fact  not  needed  and  that  the  standard  likelihood  approach  leads  to 
the  same  method  of  estimation.  This  estimation  method  is  very  general  in  that  it 
places  no  restrictions  on  the  functional  relationships  used  in  the  modeling. 

2.3.2     Linear  Multiresponse  Model 

If  the  multiresponse  model  (2.1)  is  linear  in  the  parameters,  then 

fi(xu,(3)  =  f'i(xu)Pi   i  =  l,2,...,r, 

where  fi(xu)  is  a  vector  of  order  pi  x  1  whose  elements  consist  of  powers  and 
products  of  powers  of  the  elements  of  xu  up  to  degree  d*(>  1),  and  fi{  is  a  vector  of 
Pi  unknown  constant  coefficients.  In  this  case,  the  vector  f3  in  model  (2.1)  is  of  the 
form  (3  —  (/3\  :  f3'2  :  •  ■  •  :  /3'r)',  assuming  that  the  /9/a  do  not  have  common  elements. 
The  generalized  least-squares  estimator  of  f3  is  given  by  the  value  of  /3  that 
minimizes 

T  T 

£  £  j?[Vi  -  XAgfy  -  Xi0$,  (2.5) 

where  y{  =  (jfy,  y2i>  Vzu  ■  ■  ■  ,  VNi)',  i  —  1, 2  . . .  ,  r,  Xi  is  an  N  x  p;  matrix  whose  uth 
row  is  f'i{xu),  and  axi  is  the  (i,  j)th  element  of  S_1.  Since  E  is  unknown,  Zellner 
(1962)  proposed  a  procedure  for  estimating  it.  For  this  purpose,  each  response  is 
analyzed  using  ordinary  least  squares.  Consider  the  least-squares  residuals  for  the 
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z'th  response,  e,-,  i  =  1,2, . . .  ,  r.  An  estimate  of  £  is  given  by  £  =  (<7jj),  where 

°ij  =  Jj^fii   »,i  =  1.2,. . .  ,r.  (2.6) 

An  estimate  of  /3  is  obtained  by  using  £  in  place  of  £  in  criterion  (2.5).   The 
generalized  least-squares  method  is  therefore  a  two-stage  procedure  which  does  not 
require  the  normality  assumption.  Chapter  5  contains  a  more  detailed  presentation 
of  the  analysis  for  the  linear  multiresponse  model. 

Problems  such  as  dependencies  among  the  responses  and  missing  data  are 
associated  with  the  analysis  of  multiresponse  models.  Box,  Hunter,  MacGregor,  and 
Erjavec  (1973)  considered  three  kinds  of  dependencies:  dependence  among  the  errors 
which  led  to  the  Box-Draper  criterion,  linear  dependencies  among  the  expected  values 
of  the  responses  and  linear  dependencies  in  the  data.  An  eigenvalue-eigenvector 
analysis  was  proposed  to  detect  and  identify  such  dependencies.  Khuri  (1990)  noted 
that  problems  may  occur  in  the  eigenvalue  analysis  if  the  response  variables  have 
different  units  of  measurement.  He  suggested  a  scaling  convention  for  the  responses 
to  address  this  situation.   The  problem  of  missing  data  was  addressed  by  Box, 
Draper,  and  Hunter  (1970)  by  treating  the  missing  values  as  additional  parameters. 
Stewart  and  Sorensen  (1981)  considered  an  approach  using  the  posterior  density 
function  of  f3  and  £  given  the  complete  data. 

2.4    Functions  of  Multiple  Responses 
In  some  multiresponse  experiments,  the  modeling  of  a  function  (or  functions) 
of  the  response  variables  is  of  interest.  The  following  two  multiresponse  experiments 
originated  from  the  semiconductor  industry  (T.  Moore,  personal  communication, 
1996). 
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Example  1 

SEMATECH  performed  an  experiment  involving  a  silicon  nitride  deposition 
process  using  three  factors  in  an  effort  to  identify  a  suitable  process  window.  The 
three  control  variables  (factors)  were  pressure,  total  flow,  and  gas  ratio.   The 
responses  of  interest  were  thickness  (THK),  wafer  to  wafer  standard  deviation 
(WTWSD),  and  within  wafer  standard  deviation  (WIWSD).  The  main  objective 
was  the  determination  of  desired  settings  of  the  control  variables  that  would  achieve 
a  target  value  for  THK  while  minimizing  the  variation  in  the  process.  The  problem 
at  hand  is  the  modeling  of  the  total  variation  in  the  process  (in  general,  modeling 
functions  of  the  responses  measured).  In  the  present  example,  the  total  variation  is 
defined  by  the  function 


TOTSD  =  VWTWSD2  +  WIWSD2  (2.7) 

Two  methods  of  modeling  this  variable  can  be  considered.  First,  the  new  response 
TOTSD  is  modeled  along  with  the  THK  response.   Second,  the  three  responses 
WTWSD,  WIWSD,  and  THK  are  modeled  and  their  predicted  values  are  substituted 
into  TOTSD.  It  is  of  interest  to  know  which  of  these  two  methods  is  better  and 
whether  there  are  better  alternative  approaches  to  modeling  such  a  function  of 
multiple  responses. 

Example  2 

SEMATECH  performed  an  experiment  to  investigate  a  tool  used  in  the 
process  of  polishing  wafers.  Three  control  variables  (factors)  were  considered:  table 
speed,  downforce,  and  slurry  concentration.  The  measured  responses  were  removal 
rate  of  metal  (RR),  oxide  removal  rate  (OR),  and  within  wafer  standard  deviation 
(WIWSD)  for  the  removal  rate  of  metal.  Two  functions  of  the  multiple  responses 
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were  of  primary  interest. 

RR  WIWSD 

Vl~0R        V2~       RR 

Settings  of  the  factors  that  would  maximize  yx  and  minimize  y2  were  desired. 

Each  of  the  above  examples  illustrates  a  situation  in  which  the  primary 
interest  is  the  modeling  of  a  function  (or  functions)  of  several  responses.  Little  work 
has  been  done  in  this  area. 

Hunter  and  Jaworski  (1986)  describe  a  method  for  model  building  where 
the  response  of  interest,  G(yi,  2/2,  ••  •  >  Vr),  is  a  function  of  r  responses.  A  "naive" 
approach  consists  of  calculating  the  value  of  the  response  function  for  each 
experimental  run  and  then  using  single-response  techniques  to  model  this  response 
as  a  polynomial  function  of  the  control  variables.  Hunter  and  Jaworski  proposed 
to  fit  each  of  the  r  responses  yi,y2,  •  •  •  ,  Vr  individually  and  then  the  model  for  the 
response  of  interest  is  obtained  by  G(p,\,  p,2,  •  •  •  ,/ir),  where  fii  is  the  estimate  of  the 
mean  from  the  individually  fitted  model  for  the  ith  response.  Hunter  and  Jaworski 
present  an  example  where  the  response  of  interest  is  the  product  of  y\  and  2/2  •  Using 
a  22  factorial  design  with  a  center  run,  first-order  models  are  fitted  to  yx  and  y2. 
The  models  are  then  multiplied  together  to  obtain  a  second-order  model  for  y\ y2.  A 
more  detailed  discussion  of  these  two  procedures  is  presented  in  Chapter  5. 

Hunter  and  Jaworski  place  no  constraints  on  how  the  response  of  interest, 
G(yi,  2/2,  ••  •  j  Vr),  is  related  to  yx,  y2, . . .  ,  yr.  For  example,  they  do  not  discuss  the 
problems  that  may  occur  if  G(yx,  y2,  ■  ■  ■  ,  Vr)  is  an  ill-behaved  function  such  as  a 
ratio,  j£.   Also,  correlations  among  the  multiple  responses  yi,y2, . . .  ,yr  are  not 
considered. 

Guo  and  Sachs  (1993)  employ  the  naive  approach  and  the  alternative  approach 
of  Hunter  and  Jaworski  in  modeling  spatial  uniformity  in  the  semiconductor  industry. 
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Spatial  uniformity  is  denned  to  be  the  fundamental  ratio  of  the  observed  standard 
deviation  and  observed  mean  within  a  single  wafer  or  a  batch  of  wafers  (coefficient 
of  variation).  Guo  and  Sachs  incorrectly  call  the  method  of  Hunter  and  Jaworski: 
"multiple  response  surfaces  approach."   Guo  and  Sachs  describe  the  methods  as 
follows: 

In  most  cases,  spatial  uniformity  is  not  measured  directly  but  is  calculated 
from  measurements  of  output  characteristics  taken  at  specified  locations. 
It  might  be  beneficial  to  fit  these  primitive  output  characteristics  first  and 
then  manipulate  these  models  to  construct  the  model  of  the  spatial  uni- 
formity. This  is  different  from  the  traditional  approach  of  first  calculating 
the  value  of  the  spatial  uniformity  for  each  experimental  run  and  then 
fitting  an  equation  to  these  calculated  values.  (Guo  and  Sachs  1993,  p. 
42) 
Neither  approach  recognizes  the  multivariate  nature  of  the  data. 

Guo  and  Sachs  (1993)  provide  a  simple  example  to  illustrate  the  two 
approaches  of  creating  models.    An  experiment  in  the  semicondutor  wafer 
manufacturing  consisted  of  two  factors  (xi,£2)  and  three  output  characteristics 
(deposition  rates  at  three  locations  within  the  wafer).  Table  2.1  contains  the  data 
for  the  uniformity  example.   Using  the  naive  approach  of  working  with  the  last 
column  resulted  in  the  following  second-order  model: 

J(8j  =  55.92  -  1.4528x1  -  .9196x2  +  0.01094x?  +  0.00867^  +  0.00658x!X2 
The  method  proposed  by  Hunter  and  Jaworski  suggests  the  fitting  of  first-order 
models  to  each  of  the  three  output  characteristics: 

/»!  =  29.8449  +  0.2940xi  +  0.1175x2 

Aa  =  34.0764  +  0.2048xx  +  0.1378x2 
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Table  2.1:  Guo  and  Sachs  Example:  Data  for  Uniformity  Models 


Equipment 

Spatial 

Settings 

Output  Characteristics 

Uniformity 

Run 

X\ 

x2 

2/i 

2/2 

2/3 

§(%) 

1 

46 

22 

45.994 

46.296 

48.589 

3.022 

2 

56 

22 

48.843 

48.731 

49.681 

1.058 

3 

66 

22 

51.555 

50.544 

50.908 

1.004 

4 

46 

32 

47.647 

47.846 

48.519 

0.952 

5 

56 

32 

50.208 

49.930 

50.072 

0.278 

6 

66 

32 

52.931 

52.387 

51.505 

1.377 

7 

46 

42 

47.641 

49.488 

48.947 

1.950 

8 

56 

42 

51.365 

51.365 

50.642 

0.817 

9 

66 

42 

54.436 

52.985 

51.716 

2.566 

Source:  Guo  and  Sachs  (1993,  p.  43) 

fa  =  41.3942  +  0.1346X!  +  0.0355x2 
The  model  for  spatial  uniformity  is  now  obtained  by  using  the  above  first-order 
models  in  the  following  equation: 


£7qT\  _  -s/i[(ih-£)2+fe-£)2+(£3-£)2]QQQN 

where  (i  =  |(/ii  +  fi2  +  /t3) 
This  results  in  the  following  model  for  spatial  uniformity: 


5"7(y\  __   i/34. 140-0.911x1 -0.537i2+0-0064x2+0.0029a|+0.0062:E1g2),,)nrVt 
^V/°J  —  35.105+0.211xi+0.097x2  l'lUUJ 

In  other  situations,  the  function  of  interest  can  be  a  ratio,  1LL,  of  two 
'  '  2/2' 

responses,  where  E[yi]  =  fa,  Var[t/j]  =  an,  i  =  1,2,  Cov[yi,y?\  =  <ti2  and  the 
correlation  coefficient  is  p.    Fieller  (1932)  and  Hinkley  (1969)  considered  the 
distribution  of  the  ratio,  j£  when  j/1,2/2  have  a  joint  bivariate  normal  distribution. 
(See  Appendix  A  for  the  distribution  of  [£.)  This  distribution  approaches  normality 
as  xl^L  tends  to  zero.  Shanmugalingam  (1982)  noted  that  seldom  is  this  condition 
met  in  practical  situations  and  performed  simulation  studies  to  investigate  the 
distribution.  Shanmugalingam  also  noted  that  heterogeneity  and  anormality  both 
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occur  together.  This  violates  assumptions  utilized  in  the  naive  approach  of  modeling 
a  asa  single  response.  The  work  of  Fieller  and  Hinkley  addresses  the  distribution 
of  the  ratio  but  fails  to  address  any  modeling  of  the  relationship  between  the  ratio 
and  control  variables. 

A  method  for  the  modeling  of  functions  of  multiple  responses  is  proposed  in 
chapter  5.  The  proposed  method  incorporates  the  multivariate  nature  of  the  data 
and  uses  a  Taylor's  series  approximation  of  the  function  of  interest. 


CHAPTER  3 

A  GRAPHICAL  PROCEDURE  FOR  EVALUATING  AND 

COMPARING  DESIGNS  FOR  LOGISTIC  MODELS 

3.1     Introduction 

The  design  of  experiments  has  traditionally  considered  linear  models  with 

random  experimental  errors  assumed  to  be  normally  distributed  with  a  zero  mean 

and  a  constant  variance.  The  texts  by  Fedorov  (1972),  Silvey  (1980),  Pukelsheim 

(1993),  and  Atkinson  and  Donev  (1992)  give  detailed  accounts  of  optimal  design 

theory;  while  Box  and  Draper  (1987),  Myers  and  Montgomery  (1995),  and 

Khuri  and  Cornell  (1996)  detail  design  of  experiments  from  a  response  surface 

methodology  (RSM)  perspective.  Traditional  design  theory  has  focused  on  design 

criteria  that  pertain  to  the  variance-covariance  matrix  of  the  vector  of  parameter 

estimates  for  a  given  linear  model.  A  number  of  such  criteria  have  been  grouped 

together  under  the  so-called  alphabetic  optimality,  for  example,  D-optimality  and 

G-optimality.  Alphabetic  optimality  criteria  utilize  single-valued  functions  that 

measure  the  precision  of  estimation  of  the  parameters  or  of  the  mean  response,  both 

of  which  involve  the  variance-covariance  matrix  of  the  parameter  estimates.  These 

single-valued  criteria,  however,  may  be  insufficient  to  assess  the  overall  quality 

of  prediction  of  a  given  experimental  design  throughout  the  experimental  region 

(Khuri,  Kim,  and  Um  1996).  For  linear  models,  the  variance-covariance  matrix 

of  the  parameter  estimates  depends  upon  the  design  settings,  but  is  free  of  the 

unknown  parameters  of  the  model  under  consideration. 
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In  many  experimental  situations,  the  common  assumptions  concerning  the 
form  of  the  fitted  model  (e.g.  linear)  and/or  the  distribution  of  the  error  term  (e.g. 
normal  with  a  zero  mean  and  a  constant  variance)  are  violated.  For  example,  one 
of  the  most  common  ways  in  which  violations  may  occur  is  when  the  response,  in  a 
given  experiment,  is  binary.  Binary  data  are  frequently  encountered  in  dose-response 
models,  quantal  response  models,  and  success-failure  experiments.  Piegorsch  (1998) 
discusses  engineering  situations  where  the  data  are  observed  proportions  and  the 
binomial  probability  model  is  assumed.  Carteret  al.  (1986)  and  Vidmar,  McKean, 
and  Hettmansperger  (1992)  present  examples  from  drug  combination  experiments 
with  response-nonresponse  data. 

Generalized  linear  models  (GLMs)  provide  an  extension  of  linear  models  by 
relaxing  the  assumptions  of  normality  and  constant  variance  for  the  random  errors 
(McCullagh  and  Nelder  1989).  The  basic  components  of  GLMs  consist  of  a  response 
having  a  distribution  that  belongs  to  the  exponential  family,  such  as  the  normal, 
binomial,  Poisson,  and  gamma  distributions.  In  addition,  the  mean  of  the  response 
is  assumed  to  be  a  strictly  monotone  increasing  function  of  a  linear  predictor 
which  depends  on  unknown  parameters  and  several  control  variables.  For  GLMs, 
however,  the  variance-covariance  matrix  of  the  parameter  estimates,  depends  upon 
the  unknown  parameters  of  the  linear  predictor.  This  dependency  problem  causes  a 
great  difficulty  in  the  construction  of  designs  for  GLMs.  A  few  strategies  for  dealing 
with  this  dependency  problem  include  the  use  of  local  optimal  designs  based  on  a 
"good"  initial  guess  of  the  unknown  parameters,  a  sequential  approach  whereby  the 
design  is  developed  in  stages  combined  with  updated  estimates  of  the  parameters 
(Wu  1985a,b;  Chaudhuri  and  Mykland  1993).  Another  strategy  includes  the  use 
of  the  Bayesian  methodology  which  imposes  prior  distributions  on  the  parameters 
(Chaloner  and  Larntz  1989;  Chaloner  and  Verdinelli  1995). 
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Advances  in  computing  and  graphical  methods  have,  in  recent  years,  made 
it  possible  to  develop  effective  graphical  techniques  for  evaluating  designs  which 
do  not  require  using  single-valued  criteria  as  in  traditional  design  theory.  Variance 
dispersion  graphs  (Giovannitti-Jensen  and  Myers,  1989)  and  quantile  plots  of  the 
prediction  variance  (Khuri,  Kim  and  Um,  1996)  are  such  techniques  used  in  RSM 
models  to  assess  the  prediction  capability  of  a  design.  These  techniques,  however, 
still  encounter  the  same  problem  of  dependence  upon  unknown  parameters  when 
applied  to  nonlinear  models  and  GLMs.   For  the  latter  models,  the  parameter 
estimates  are  biased. 

While  investigating  the  bias  and  accuracy  of  parameter  estimates  for  the 
quantal  response  model,  Sowden  (1971,  p.  602)  commented  that  his  results  "have 
obvious  implications  for  designing  quantal  response  experiments.    A  detailed 
investigation  may  well  suggest  alternative  design  criteria."  For  nonlinear  models, 
both  bias  and  variance  functions  should  be  considered  in  the  evaluation  of 
experimental  designs.  Khuri  and  Lee  (1996)  suggested  a  graphical  approach  based 
on  the  mean  squared  error  of  prediction  for  evaluating  and  comparing  designs  for 
nonlinear  models.  The  use  of  the  mean-squared  error  of  prediction  incorporates  the 
prediction  bias  as  well  as  the  prediction  variance. 

In  this  chapter,  a  graphical  technique,  based  on  the  mean-squared  error  of 
prediction,  is  developed  for  the  evaluation  and  comparison  of  experimental  designs 
for  logistic  regression  models. 
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3.2     Prediction  Capability  of  Design 
The  prediction  capability  of  a  given  design  will  be  evaluated  using  the 
mean-squared  error  of  prediction  namely, 

MSE[£(x)]  =  Var[/}(cc)]+}Bias[/}(a;]}2,  (3.1) 

where  fi(x)  is  the  estimated  value  of  the  mean  response. 

3.2.1     Generalized  Linear  Models 

Consider  a  sample,  yu  . . .  ,yN,  of  independent  random  response  variables 
each  having  a  density  in  the  exponential  family  (Cordeiro  and  McCullagh  1991), 
that  is, 

S(Vu\  On,  <t>)  =  exp[(f){yju  -  b(8u)  +  c(yu)}  +  d(yu,  (/>)},  u  =  1,  2, . . .  ,  N.  (3.2) 

where  6(-),c(-),  and  d(-,-)  are  known  functions  and  <f>  is  a  dispersion  parameter. 
It  follows  that  E(yu)  =  ft,  =  ^  and  Var(yu)  =  f,  where  Vu  =  J*  (Cordeiro 
and  McCullagh  1991,  p.  631).  The  specification  of  a  link  function,  t)  =  g(n),  is 
required  where  g(-)  is  a  strictly  monotone  increasing  function  and  77  is  a  linear 
predictor.  Let  xu  be  a  vector  of  control  variables  associated  with  the  uih  response 
value,  yu,  u  =  1, 2, . . .  ,  N.  The  mean  response  at  a  point  jc  in  a  region  of  interest  is 
modeled  as  a  function  of  the  control  variables  of  the  form  fi(x)  =  h[rj(x)],  where  h(-) 
is  the  inverse  function  of  g(-),  the  elements  of  x  consist  of  the  control  variables,  and 
T)(x)  is  assumed  to  be  a  linear  function  of  the  form  r](x)  =  f'(x)/3.  Here,  f(x)  is 
a  vector  of  order  pxl  whose  elements  consist  of  powers  and  products  of  powers  of 
the  elements  of  x  up  to  degree  d(>  1),  and  ft  is  a  vector  of  p  unknown  parameters. 
The  method  of  maximum  likelihood  is  utilized  to  estimate  ft  implemented  through 
an  iterative  reweighted  least  squares  procedure  as  discussed  in  Khuri  (1993)  and 
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McCullagh  and  Nelder  (1989,  pp.  40-43).  Using  the  work  of  Khuri  (1993),  which 
considered  RSM  within  the  framework  of  GLMs,  an  approximate  prediction  variance 
of  fi(x),  the  estimated  value  of  the  mean  response  at  x,  is  given  by 

Var[A(*)]  =  l-(^ff'{x)[X'WX}-'f{x)  (3.3) 

where  X  is  a  matrix  whose  ttth  row  is  f(xu)  u  =  1,2, . . .  ,  N,  and  W  = 
Diag(u;i, . . .  .w^),  where  wu  =  -^ —  with  |g*  denoting  the  derivative  of  \i  with 
respect  to  rj  evaluated  at  xu.  Note  that  in  Khuri  (1993),  wu  =  (^r)2xr^h — J  = 
(f^-)2^"-  The  dispersion  parameter,  <f>,  has  been  factored  out  in  expression  (3.3). 

3.2.2     Bias  Computation 

Difficulty  in  the  computation  of  bias  of  parameter  estimates  has  hindered  the 
use  of  bias  as  a  criterion  for  the  selection  of  designs  for  nonlinear  models  as  well 
as  generalized  linear  models.  Cox  and  Snell  (1968)  and  Cox  and  Hinkley  (1974) 
developed  a  bias  approximation  formula  for  the  maximum  likelihood  estimate  of  a 
single  parameter  using  likelihood  theory  and  Taylor's  series.  Using  tensor  methods, 
McCullagh  (1987)  developed  a  multiparameter  version  of  the  bias  computation. 
Bias  results  have  been  further  developed  for  nonlinear  regression  models  with 
normal  errors  (Cook,  Tsai,  and  Wei  1986)  and  with  exponential  family  errors  (Paula 
1992).  Cordeiro  and  McCullagh  (1991)  used  the  tensor  methodology  to  develop  bias 
computation  for  generalized  linear  models.  Khuri  and  Lee  (1996)  used  Cook  et  al. 
(1986)  bias  formula  in  the  development  of  their  graphical  procedure  for  comparing 
designs  for  nonlinear  models.  Cadigan  (1994)  presented  a 

bias  computation  method  which  did  not  require  the  use  of  the  tensor 
methodology.  His  method  resulted  in  a  relatively  simple  computational  algorithm 
for  bias  approximation.  This  method  is  summarized  here. 
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Let  L  denote  the  log  likelihood  function,  where  L  —  L(/3)  is  a  function  of 
fi,  a  p  x  1  vector  of  parameters  to  be  estimated.  Let  u(/3)  be  a  p  x  1  vector  of 
first-order  partial  derivatives  of  L  with  respect  to  /3,  and  let  j((3)  be  a  p  x  p  matrix 
of  second-order  partial  derivatives  of  L  [Hessian  matrix  of  L].  Fisher's  information 
matrix  is  then  given  by  I  =  -E[j(/3)}  =  E[u(0)u'(0)}. 

Using  the  likelihood  theory  and  Taylor's  series,  Cadigan  (1994,  p.    92, 
formula  (5))  developed  the  following  bias  approximation  for  the  maximum  likelihood 
estimate,  /3,  using  a  second-order  approximation  of  the  bias: 

BiasOS)  «  rHmWr'um  +  ^[^^vec/-1}  (3.4) 

where  if  A  is  m  x  n  and  A  —  [ai, . . .  ,  on],  then  vec(A)  is  an  mn  x  1  vector  of  the 
form 


vec(A) 


a2 


A  Bias  Formula  for  Generalized  Linear  Models 

Cadigan  (1994)  applied  the  bias  formula  (3.4)  to  GLMs.  The  resulting  bias 
approximation  for  GLMs  is  given  by  Cadigan  (1994)  and  Cordeiro  and  McCullagh 
(1991,  eq.  4.2): 

Bias(^)  =  -±-{X'WX)-lX'ZdFln  (3.5) 

where  X  and  W  are  defined  as  before  (see  formula  (3.3)),  and  Zj  = 
Diag(zn,...  ,znn),  where  zuu  is  the  «th  diagonal  element  of  Z  = 
X(X'WX)-lX',F  =  Diag(/n,.../™),  where  fuu  =  J^L,  and  1„  is 
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an  JV  x  1  vector  of  ones.  £*  and  W  are  the  first  and  second  derivatives  of  n  with 
respect  to  77  evaluated  at  xu. 

Cordeiro  and  McCullagh  (1991)  further  developed  the  following  approximate 
expressions  for  the  bias  in  tj  =  XJ3  and  fi  =  h(fj),  where  /i  and  -q  are  the  vectors 
whose  wth  elements  are  fi(xu)  =  h[fj(xu)}  and  fj(xu)  =  f'(xu)/3,  respectively, 
a  =  1,2,...  ,N: 

Biasft)  =  -±-ZZdFln  (3.6) 

1(f) 

Bias(A)  =  ^r(G2  -  G1ZF)Zdln  (3.7) 

2(f> 

where  d  =  Diag(g^)  and  G2  =  Diag(^)  (Cordeiro  and  McCullagh  1991,  eqs. 
4.3  and  4.4).  Expression  (3.7)  follows  from  considering  the  bias  of  fi  which  is  a 
function  of  77.  Cordeiro  and  McCullagh  (1991)  noted  that  to  order  n~l, 

Bias[A(xu)]  =  Bias[r)(*u)]&  +  ^var^a^fe,    tt  =  1,2, . . .  ,N.         (3.8) 

Note  that  Var[7/(x)]  =  Var[/'(x)3]  ~  \f\x){X'WX)-xf{x)  and  Bias[r)(x)]  W 
/'(a;)Bias(/3)  The  expressions  in  (3.3)  and  (3.8)  can  then  be  used  to  get  the 
mean-squared  error  of  prediction  for  the  mean  response. 

Cordeiro  and  McCullagh  (1991)  also  noted  that  the  bias  of  the  maximum 
likelihood  estimates,  f3,  in  generalized  linear  models  can  be  computed  through 
a  simple  weighted  linear  regression  computation.  Letting  £  =  —  ^jW~  ZdFlN, 
Bias(/3)  in  (3.5)  can  be  written  as 

Bias(/3)  =  (X'WX)-lX'W£.  (3.9) 

The  bias  of  (3  is  then  the  vector  of  regression  coefficients  based  on  the  weighted 
linear  regression  of  £  on  X  with  W  being  the  weight  matrix. 
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3.3    Application  to  Logistic  Models 
Consider  independent  random  variables,  rx, . . .  ,  rn,  with  ru  having  a  binomial 
distribution  with  index  mu  and  probability  of  success  7ru,  (u  =  1, . . .  ,  n).  Hence, 
E(r„)  =  mu7ru  and  Var(ru)  =  m„7ru(l  -  tru).  Let  xu  be  a  vector  of  design  settings  of 
a  set  of  control  variables  associated  with  ru  (u  =  1,2,...  ,  n).  Consider  the  sample 
proportions  yu  =  ^  (it  =  1, . . .  ,n).  Then,  E(y„)  =  7TU  and  Var(y„)  =  7r"^~7r").  The 
density  for  a  sample  proportion  yu  has  therefore  the  following  form: 

exp[yumulog(- — — )  +  mulog(l  -  -ku)  +  logl       u    )].  (3.10) 

1  -  ««  \muyuJ 

This  density  is  in  the  exponential  family  (see  equation  (3.2))  with  <j>  —  1,9U  = 
mulog{^),  b(9u)  =  mJog{l  +  exp(^)),  c(yu)  =  0,  and  d(yu,  0)  =  log^J. 

Using  a  logit  link  function  with  a  linear  predictor  to  model  the 
probability  of  success,  iru,  as  a  function  of  the  control  variables,  xu,  results  in 
g[Tr(xu)]  =  logical))  ~  f'(xu)P  =  ^{vu)  =  *•  Hence,  the  mean  response  is 
E[y(xu)}  =  n{xu)  =  tt(xu)  =  ttu  =  ^expS)]- 

For  the  calculation  of  Bias(/3)  =  (X'WX^X'W^,  W  and  £  are  needed 
(see  expression  (3.9)).    For  W  =  Diag(wi,...  .wn),  where  wu  =    d^"      and 
Vu  =  ^p-.  We  have  w„  =  m„7ru(l  -  iru),  £  =  -^W~xZ,}F\n  whose  uth.  element 


:.      i    Vu    y    (~^t^t]  .       i,    (^t]  ..  _i,    exp(77u)-exp(2r?u)  ,     exp(r;u)      _ 
1S      2(^u-)2^««       Vu      "  —      2^uu  FIST  —      2^uu     (i+exp(r?„))3     /  (l+exp(T)„))2 

_i       l-expfo.,)  _  _i       /«  _  2     x  _        /      _  .  /2n  j,  n 

2,iuu  l-)-exp(r)u)    —        2    uu\  *"%)   —  ^u«  V'u        A/  */>  u  —  *!*»••■   >  «•■ 

Consider  the  mean-squared  error  of  prediction  for  ir(x)  at  the  setting, 
x,  of  the  control  variables,  MSE[7r(x)]  =  Var[7r(x)]  +  (Bias[7r(x)]}2.    Here, 
n(x)  =  l+exprtani  wnere  v(x)  —  f'(x)P-  By  formula  (3.3),  an  approximate  variance 


27 
of  ft  is  given  by 

Var(7r(x))    «    {^ffi^WX^fim) 

«    7r(x)2(l-7r(x))7'(x)(X'WX)-7(*)-  (3.H) 

For  the  prediction  bias  of  7r(x),  formula  (3.8)  leads  to  the  following  bias  expression: 


,dx(x)      1..    r„,  ..d2***) 

ir+2VarW*)]^v 


Bias(7r(x))    w    Bias[^(x)]^P  +  ^Var[7)(x)]^-Vi  (3.12) 


«    Bias[7)(x)]7r(x)(l  -  tt(x))  +  -Var[?7(x)]7r(x)[l  -  tt(x)][1  -  2?r(x)]. 

Noting  that  Varf^x)]  =  Var[/'(x)0]  n  f(*)(XWX)-l/(f)  and  Bias[»/(x)]  = 
/'(x)Bias(/3),  we  finally  get  from  formulas  (3.11)  and  (3.12)  the  following  expression 
for  the  mean-squared  error  of  prediction  at  x 

MSE(tt(x))    =    7r2(x)(l-7r(x))2/'(x)(X'WX)-7(a:)  (3-13) 

+{/'(x)Bias09)7r(x)(l-7r(x)) 
+  1-f'(x)(X'WX)-1f(x)n(x)[l  -  w(x)][l  -  2tt(x)]}2. 

It  should  be  noted  that  MSE[#(x)]  depends  upon  the  unknown  /3.  An  estimate  can 
be  obtained  by  replacing  /3  with  J3. 

3.4     Quantile  Dispersion  Graphs 
As  mentioned  earlier,  traditional  single-valued  design  criteria  may  be 
insufficient  to  fully  characterize  the  efficiency  of  an  experimental  design.  In  order 
to  assess  the  prediction  capability  of  an  experimental  design  for  linear  response 
surface  models,  Giovannitti-Jensen  and  Myers  (1989)  suggested  the  use  of  variance 
dispersion  graphs  (VDGs).  The  VDGs  consist  of  plots  of  the  maxima  and  minima 
of  the  prediction  variance  on  concentric  spheres  contained  inside  an  experimental 
region,  R.   Khuri,  Kim  and  Um  (1996)  proposed  using  plots  of  quantiles  of  the 
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prediction  variance  on  concentric  spheres  as  a  means  of  getting  more  information 
about  the  design's  prediction  capability  than  can  be  obtained  from  the  VDGs. 
The  quantile  plots  allow  complete  assessment  of  the  distribution  of  the  prediction 
variance  on  concentric  spheres  throughout  the  experimental  region,  R. 

In  the  linear  model  situation,  the  prediction  variance  depends  upon  the 
design  used  and  the  location  at  which  prediction  is  to  be  made.   In  GLMs  and 
nonlinear  models  the  mean-squared  error  of  prediction  (MSEP)  is  considered 
because  of  biased  parameter  estimates.  However,  the  MSEP  now  depends  upon  the 
prediction  location,  x,  the  design  settings,  and  on  the  unknown  model  parameters. 
More  specifically,  the  MSEP  depends  upon  the  unknown  elements  of  /3  in  the  linear 
predictor.  Let  td(x,/3)  =  MSE[/2(x)],  where  D  is  a  given  design.  After  a  design 
D  has  been  used  in  the  fitting  of  a  model,  the  prediction  capability  of  the  design 
can  be  investigated  by  using  an  estimate  of  td(x,0)  obtained  by  replacing  /3  with 
J3.  However,  if  no  estimate  of  /3  is  available  or  comparisons  of  several  designs  is  of 
interest,  some  subset  C  of  the  parameter  space  of  /3  must  be  considered  in  order  to 
assess  the  dependency  of  the  MSEP  on  the  unknown  parameters,  as  we  shall  now 
explain. 

Quantile  dispersion  graphs  of  the  MSEP,  which  we  shall  now  define,  can 
be  utilized  to  evaluate  and  compare  the  prediction  capability  of  designs:  Consider 
several  concentric  surfaces,  denoted  by  R\,  located  within  the  experimental  region, 
R.  These  surfaces  are  obtained  by  reducing  the  boundary  of  R  using  a  shrinkage 
factor  A.  The  prediction  capability  can  be  assessed  by  looking  at  the  distribution 
of  the  MSEP  on  each  of  several  of  these  concentric  surfaces.   More  specifically, 
the  distribution  of  MSEP  on  a  concentric  surface  is  fully  described  in  terms  of  its 
quantiles.  For  a  given  design  D  and  a  given  /3  €  C,  let  QD(p,ft,  A)  denote  the  pth 
quantile  of  the  distribution  of  values  of  td(x,  j3)  on  R\,  for  a  specified  value  of  the 
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shrinkage  factor  A.  Now  to  assess  the  dependency  of  the  quantiles  of  the  MSEP  on 
/3,  Qd(p,  0,  A)  is  computed  for  several  values  of  j3  forming  a  grid  of  points  inside  the 
region  C.  Let  us  then  consider  the  following  quantities: 

Qgiax(p,A)  =  maxQI)(p,/3,A)  (3.14) 

pec 

QTU(P,\)  =  ™™Qd(p,P,\)  (3-15) 

pec 

Plots  of  <2max(p,  A)  and  Q™m(p,\)  against  p  result  in  the  so-called  quantile 
dispersion  graphs  (QDGs)  of  the  MSEP.  Such  plots,  which  can  be  constructed  for 
each  of  several  values  of  A,  provide  a  comprehensive  assessment  of  the  prediction 
capability  of  a  design  D  throughout  the  region  R.  Several  designs  can  therefore  be 
compared  by  examining  their  corresponding  QDG  profiles.  Similar  QDGs  graphs 
were  used  by  Khuri  and  Lee  (1998)  for  the  comparison  of  designs  for  nonlinear 
models(Khuri  1997;  Lee  and  Khuri  1999). 

3.5     Numerical  Examples 
Example  1 

Carter  et  al.  (1986)  utilized  logistic  regression  in  a  cytotoxicity  study  to 
investigate  the  dose-response  curve  for  a  combination  of  two  agents.  Cell  cytotoxicity 
for  the  combination  of  the  two  agents,  methylmethanesulfonate  (MMS)  and  phorbol 
12-myristate  13-acetate  (PMA),  was  evaluated  by  counting  the  number  of  viable  and 
dead  cells.  Interest  was  in  modeling  tt(x),  the  probability  of  death,  as  a  function  of 
xi  =  concentration  of  MMS/10  and  x2  =  concentration  of  PMA. 

The  results  of  the  experiment  are  shown  in  Table  3.1.   Each  of  the  two 
agents  had  4  levels.  The  fitted  model  is  given  in  equation  (3.16)  and  its  estimated 
parameters  and  standard  errors  are  shown  in  Table  3.2. 
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Table  3.1:  Experimental  design  and  response  values  (Example  1) 


Xi 

X2 

Number  of 

Total  Number 

MMS/10 

PMA 

Dead  Cells  ru 

of  Cells  mu 

y*  ~  s£ 

0 

0 

19 

98 

0.19 

0 

1 

24 

87 

0.28 

0 

10 

54 

91 

0.59 

0 

100 

68 

87 

0.78 

1 

0 

16 

91 

0.18 

10 

0 

17 

90 

0.19 

25 

0 

19 

92 

0.21 

1 

1 

19 

94 

0.20 

1 

10 

41 

90 

0.46 

1 

100 

79 

93 

0.85 

10 

1 

10 

83 

0.12 

10 

10 

36 

85 

0.42 

10 

100 

62 

83 

0.75 

25 

1 

36 

92 

0.39 

25 

10 

56 

93 

0.60 

25 

100 

74 

87 

0.85 

1436 


log 


7r(x) 
1  —  7I"(x) 


=  A>  +  ftsi  +  &z2  +  fti*i  +  &2Z2  (3-16) 


The  experimental  region,  R,  of  the  agent  combinations  is  rectangular  with 
0  <  xx  <  25  and  0  <  x2  <  100.  Let  the  design  shown  in  Table  3.1  be  denoted  by  D\. 
To  illustrate  the  proposed  graphical  technique,  QDGs  can  be  utilized  to  investigate 
the  prediction  capability  of  Di  and  to  compare  it  against  two  other  designs,  D2 
and  D3.  The  latter  two  designs  have  the  same  settings  for  xi  and  x2  as  in  D\,  but 
differ  in  the  numbers  of  replications  at  the  combinations  of  x\  and  x2.  Design  D2 
is  balanced  in  the  sense  that  the  numbers  of  replications  are  equal  with  magnitudes 
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Table  3.2:  Parameter  Estimates  and  Model  Analysis  (Example  1) 


Parameter 

Estimate 

Std  Error 

P-Value 

A) 

-1.329935000 

0.1179 

<  0.0001 

ft 

-0.083535200 

0.0266 

0.0017 

& 

0.159119900 

0.0163 

<  0.0001 

Pn 

0.003883342 

0.0010 

0.0002 

023 

-0.001307519 

0.00016 

<  0.0001 

Scaled  Deviance  =  13.9195  DF  =  11 

comparable  to  those  for  Dx,  which  is  considered  nearly  balanced.  Design  D3  is 
unbalanced  with  widely  different  numbers  of  replications.  The  total  number  of 
replications  is  the  same  as  in  D2.  These  designs  are  given  in  Table  3.3. 

For  each  of  the  three  designs  in  in  Table  3.3,  we  consider  the  distribution  of 
td(x,  /3)  on  each  of  several  concentric  rectangles,  R\,  obtained  by  a  reduction  of  the 
boundary  of  the  experimental  region,  R,  using  a  shrinkage  factor  A,  0.5  <  A  <  1  (see 
Figure  3.1).  This  is  done  as  follows:  let  the  bounds  on  variable  X{  be  a*  and  b{  such 
that  di  <  Xi  <  bi,  i  =  1,2.  For  each  value  of  A,  the  corresponding  reduced  bounds 
on  Xi  are 

a*  +  (1  -  \){bi  -  di)  <  Xt  <  bi  -  (1  -  X)(bi  -  en). 

To  investigate  the  dependency  of  td(x,  0)  on  /3,  a  subset,  C,  of  the  parameter  space 
for  the  /3  vector  is  considered.  For  each  parameter,  a  range  consisting  of  the  point 
estimate  (given  in  Table  3.2)  plus/minus  two  standard  errors  was  considered.  The 
subset  C  was  obtained  by  selecting  3  points  within  each  parameter  range,  namely, 
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Table  3.3:  Designs:  D\,  D2,  and  D3  (Example  1) 


Xi 

X2 

A 

D2 

Ds 

MMS/10 

PMA 

Number  of  Cells 

Number  of  Cells 

Number  of  Cells 

0 

0 

98 

90 

15 

0 

1 

87 

90 

25 

0 

10 

91 

90 

35 

0 

100 

87 

90 

45 

1 

0 

91 

90 

55 

10 

0 

90 

90 

65 

25 

0 

92 

90 

75 

1 

1 

94 

90 

85 

1 

10 

90 

90 

95 

1 

100 

93 

90 

105 

10 

1 

83 

90 

115 

10 

10 

85 

90 

125 

10 

100 

83 

90 

135 

25 

1 

92 

90 

145 

25 

10 

93 

90 

155 

25 

100 

87 

90 

165 

1436  1440  1440 


IT) 
CVJ 
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the  point  estimate  and  the  two  end  points. 

A)  e  (-1.565667,-1.329935,-1.094203) 

ft  6  (-0.13676336,  -0.08353520,  -0.03030704) 

$3  €  (0.1264244,0.1591199,0.1918154) 

fti  €  (0.001821515,0.003883342,0.005945169) 

&2  e  (-0.0016228061,  -0.0013075190,  -0.0009922319) 

For  each  design  and  a  selected  value  of  /3  in  C,  quantiles  of  the  distribution  of 
td(x,0)  are  obtained  for  x  6  R\,  where  A  is  one  of  several  values  of  A  chosen  from 
the  interval  (0.5, 1].  The  number  of  points  chosen  on  each  R\  was  2000  consisting  of 
500  on  each  side.  The  quantiles  are  calculated  for  p  =  0(0.05)1.  The  procedure  is 
repeated  for  other  values  of  /3  in  the  subset,  C.  Then,  Q^a,x(p,  A)  and  Q™m{p,  A) 
are  calculated  using  formulas  (3.14)  and  (3.15).  The  S-PLUS  language  (Becker, 
Chambers,  and  Wilks  1988)  was  used  for  conducting  the  numerical  investigation  and 
obtaining  the  actual  plots. 

The  QDGs  for  the  comparison  of  designs  D\  and  D2  are  presented  in 
Figure  3.2.  The  plots  for  the  two  designs  overlap  so  closely  indicating  a  very  small 
difference  in  the  prediction  capabilities  for  Di  and  D2.  We  recall  that  D\  differs 
from  D<i  by  having  slightly  unequal  numbers  of  cells  for  the  agent  combinations.  The 
dependency  on  the  parameter  vector  /3  can  be  seen  in  the  separation  in  the  values  of 
Qmax  an(j  Qmin^  eSpeciaHy  near  the  center  of  the  experimental  region  (A  =  0.7,0.6). 
The  prediction  capabilities  of  the  designs  appear  to  be  sensitive  to  the  parameter 
values.  A  desirable  property  for  a  design  is  to  have  close  and  small  values  of  Qmax 
and  Qmm  over  the  range  of  p.  The  closeness  of  Qmax  and  Qmin  indicates  robustness, 
or  lack  of  sensitivity,  of  the  MSEP  to  the  parameter  values. 
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In  order  to  see  the  effect  of  extremely  "unbalanced"  cell  counts  at  the  agent 
combinations,  designs  D2  and  D3  are  compared  in  Figure  3.3.   Once  again,  the 
designs  appear  sensitive  to  the  parameter  values  especially  near  the  center  of  the 
experimental  region.  However,  the  balanced  design  D2  appears  to  be  somewhat 
more  robust  to  the  parameter  values  than  D3. 

Example  2 

Recognizing  the  dependency  of  design  criteria  on  model  parameters,  Sitter 
(1992)  proposed  a  minimax  procedure  to  obtain  designs  for  binary  data  that 
are  robust  to  poor  initial  parameter  estimates.   A  number  of  design  optimality 
criteria,  including  D-optimality,  were  considered.  This  minimax  criterion  resulted 
in  designs  with  more  design  points  and  greater  coverage  of  the  design  space  than 
traditional  design  criteria.  Experimental  situations  with  one  control  variable,  x, 
were  investigated.  The  probability  of  interest,  n(x),  was  modeled  as  a  function  of 
f3(x  —  /x).  The  corresponding  linear  predictor  is  /50  +  f$ix,  where  the  slope  parameter 
Pi  is  (3  and  the  intercept  parameter  /30  is  —  ftp.  Here,  /j  is  the  50%  response  dose, 
which  is  termed  ED50.  This  results  in  the  following  logistic  regression  model 

7r(xu) 


log 


0[xu  -  n]=  fi0  +  Pixu   u  =  1, 2, . . .  ,  n. 


1  -7r(xu) 

Let  xu  be  the  wth  design  point,  it(xu)  the  corresponding  probability,  and  mu  the 
number  of  observations  at  xu.  The  determinant  of  the  information  matrix,  |X'WX|, 
for  this  model,  is  equal  to  (Scazzero  and  Ord  1993,  p.  257) 


J^  ^2  mimj'K(xi){l  -  7r(a:i)]7r(rrj)[l  -  ir(xj)][xi  -  Xjf 


i<j 

Insight  into  the  D-optimality  criterion,  which  maximizes  this  determinant,  can 
be  gained  by  a  close  examination  of  this  result.  The  criterion  depends  upon  the 
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probabilities  as  well  as  the  design  points  specifically  through  their  spacings  [re,  —  Xj]2. 
For  the  D-optimal  design  consisting  of  two  design  points  the  determinant  reduces  to 

mim2ir(xi)[l  -  Tt(x1)]Tr(x2)[l  -  ir(x2)][xi  -  x2]2. 

Sitter's  (1992)  minimax  approach  amounts  to  finding  a  design  that  minimizes  the 
maximum  of  the  reciprocal  of  this  determinant  over  some  parameter  space  of  \i 
and  /?.  Note  that  the  determinant  is  close  to  0  if  either  7r(a;i)  or  n(x2)  is  near  0  or 
1.  In  this  situation,  the  D-optimality  criterion  becomes  difficult  to  interpret  and 
calculating  the  inverse  of  the  information  matrix  can  become  problematic.  The 
inverse  of  the  information  matrix  is  used  in  calculating  the  MSEP,  (see  equation 
(3.13)).   Even  though  the  design  space  may  be  unbounded  in  an  experimental 
situation,  Atkinson  (1995)  noted  that  designs  for  logistic  regression  models  are 
constrained  such  that  the  probabilities  were  not  too  close  to  zero  or  one.  For  the 
comparison  of  designs,  it  will  also  be  necessary  to  be  cautious  in  the  selection  of  the 
parameter  space  since  the  probabilities  also  depend  upon  the  model  parameters. 

Sitter  (1992,  p.   1151)  presented  an  illustration  of  the  minimax  procedure. 
A  logistic  regression  model  was  utilized  in  a  study  to  investigate  the  economic 
value  of  sport  fishing  in  British  Columbia.  The  control  variable  was  the  amount 
of  increased  fishing  cost,  x,  and  the  binary  response  consisted  of  fishing/not 
fishing.    Probability  of  fishing  was  to  be  modeled  as  a  function  of  increased 
fishing  cost.  The  D-optimal  design  for  initial  parameter  estimates  /j,0  =  40  and 
/30  =  0.9  consisted  of  two  design  points  x  =  38.28,41.72.  The  minimax  D-optimal 
design  produced  by  Sitter's  minimax  procedure  consisted  of  the  7  design  points 
x  =  31.72,34.48,37.24,40.00,42.76,45.52,48.28.  The  minimax  design  was  to  be 
robust  over  the  parameter  space  given  by  33  <  /j,  <  47  and  0.5  <  /3  <  1.25.  Both 
designs  assume  equal  number  of  observations  at  the  design  points. 
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Sitter  noted  that  the  minimax  D-optimal  design  performed  better  than 
the  D-optimal  design  for  most  of  the  parameter  space.    Sitter  also  noted  the 
performance  of  the  D-optimal  design  was  very  sensitive  to  the  initial  estimate  of 
\i.  The  D-optimal  design  was  better  than  the  minimax  D-optimal  design  for  initial 
estimates  of  fi  that  were  nearly  correct.  As  noted  before,  the  D-optimality  criterion 
for  the  two  point  D-optimal  design  can  be  affected  if  the  probabilities  get  close 
to  zero  or  one.  For  this  example,  the  probabilities  for  the  D-optimal  design  equal 

^(38.28)  =  i+exP(-/3[38.28-M])  and  '(41.72)  =  1+exp(_/[41,72_/i])  •  Note  that  as  ft  moves 
away  from  the  initial  estimate  of  40,  the  probabilities  can  get  very  close  to  zero 
or  one  regardless  of  the  value  of  (3.  Thus  over  much  of  the  parameter  space,  the 
D-optimal  design  has  a  determinant  close  to  zero.  As  a  result,  the  parameter  space 
for  [i  was  restricted  to  38  <  ft  <  42  in  order  to  compare  the  designs  in  terms  of 
prediction  capability. 

Note  also  the  presence  of  ir(x)  and  1  —  ir(x)  in  the  MSEP  expression  in 
equation  (3.13).  We  should  therefore  be  cautious  about  probability  values  near  zero 
or  one.  An  experimental  region,  R,  30  <  x  <  50  was  then  considered  to  control  the 
probability  values  and  to  agree  with  the  actual  regions  covered  by  the  two  designs. 

The  following  is  a  graphical  comparison  of  the  prediction  capabilities  of 


the  two  designs  for  the  model  log 


7r(gu) 


Po  +  P\xu    u  =  l,2,...,n.  The 


l-7r(xu) 

parameter  space  C  for  (A),  A)  is  trapezoidal  and  consists  of  0.5  <  fii  <  1.25  and 
— /3i(42)  <  f3o  <  — /?i(38).  This  parameter  space  corresponds  to  38  <  //  <  42  and 
0.5  <  ft  <  1.25  where  the  slope  parameter  fix  is  f3  and  the  intercept  parameter  f3Q 
is  — /3/i.  The  experimental  region  investigated  was  30  <  x  <  50  and  the  designs 
consisted  of  70  runs  with  equal  weight  to  the  design  points  .  For  selected  parameter 
values  (A),/?i)  in  C,  the  MSEP  [see  equation  (3.13)]  can  be  calculated  throughout 
the  experimental  region.  Since  in  this  example  there  is  only  one  control  variable,  no 


40 

shrinkage  of  the  experimental  region  was  necessary.  Minimum  and  maximum  MSEP 
plots  for  points  in  the  experimental  region  can  be  constructed  where  the  minimum 
and  maximum  are  obtained  over  the  parameter  space  C.  The  plots  for  the  two 
designs  are  given  in  Figure  3.4.  We  can  see  that  the  D-optimal  design  does  well  near 
the  center  of  the  experimental  region  while  the  minimax  D-optimal  design  performs 
better  away  from  the  center.  The  quantile  dispersion  graphs  for  the  two  designs  are 
given  in  Figures  3.5  and  3.6.    From  these  plots,  it  appears  that  the  dispersion  of  the 
quantiles  for  the  minimax  D-optimal  design  is  less  than  the  D-optimal  design.  From 
this,  it  appears  that  the  minimax  design  is  somewhat  more  robust  to  the  parameter 
values.  Figure  3.7  contains  an  overlaying  display  of  the  QDGs  for  the  two  designs. 
It  is  evident  that  the  minimax  design  performs  slightly  better  than  the  D-optimal 
design.  Sitter  (1992)  claimed  that  the  minimax  design  performed  much  better  over 
a  large  portion  of  the  parameter  space.   His  claim  can  be  partly  understood  by 
considering  the  sensitivity  of  determinant  of  the  two-level  D-optimal  design  to  the 
initial  estimate  of  fj,.  For  Sitter's  parameter  space  the  two-level  D-optimal  design 
has  a  determinant  close  to  zero  for  much  of  the  parameter  space  of  //.  However,  our 
graphical  analysis  used  a  restricted  parameter  space,  as  was  seen  earlier,  to  allow 
for  a  fair  comparison  of  the  two  designs.  In  this  analysis,  the  minimax  D-optimal 
design  still  appears  to  perform  better  than  the  D-optimal  design,  but  not  to  the 
level  claimed  by  Sitter. 

Example  3 

Piegorsch  (1998)  provided  an  introduction  to  modeling  for  data  which  are 
proportions  often  encountered  in  engineering/quality  settings.   He  presented  an 
example  involving  alloy  fasteners  used  in  aircraft  construction  (Montgomery  and 
Peck  1982,  sec  6.3).  Interest  was  in  modeling  the  probability  of  fastener  failure 
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Table  3.4:  Experimental  design  and  response  values  (Example  3) 


X  = 

coded  x  = 

mu  = 

y  = 

Pressure  Load 

2X-6800 
isnn 

Total  Fasteners 

Number  Failing 

2500 

-1 

50 

10 

2700 

-7/9 

70 

17 

2900 

-5/9 

100 

30 

3100 

-3/9 

60 

21 

3300 

-1/9 

40 

18 

3500 

1/9 

85 

43 

3700 

3/9 

90 

54 

3900 

5/9 

50 

33 

4100 

7/9 

80 

60 

4300 

1 

65 
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Table  3.5:  Parameter  Estimates  and  Model  Analysis  using  coded  x  (Example  3) 


Parameter  Estimate     Std  Error    P- Value 

Jo  -0.0750  0.0830       0.3657 

fa 1.3936  0.1418       0.0001 

Scaled  Deviance  =  0.3719  DF  =  8 

as  a  function  of  pressure  load.  The  results  of  the  experiment  are  given  in  Table 
3.4.  10  different  pressure  were  considered  with  the  number  of  fasteners  and  failures 
recorded.  The  fitted  model  is  given  in  equation  (3.17)  and  its  estimated  parameters 
and  standard  errors  are  shown  in  Table  3.5. 


log 


ir(x) 
1  —  n(x) 


fa+fa  (3.17) 


The  D-optimal  design  for  experimental  situations  using  model  (3.17) 
is  given  by  Minkin  (1987).    This  design  is  a  two  point  design  with  x- values: 
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Xi  =  -(1.5434  +  /30)//?i  and  x2  =  (1.5434  -  A>)/A-  This  design  illustrates  the 
design  dependency  problem.  Using  the  estimates  from  Table  3.5,  in  retrospect, 
results  in  X\  =  -1.053655  and  x2  =  1.161344  which  are  values  outside  the  explored 
experimental  region. 

To  investigate  the  prediction  capabilities  of  the  design,  it  will  be  compared 
to  a  design  with  the  same  pressure  loads  but  with  equal  replication  (69  fasteners)  at 
each  of  the  loads.  The  design  in  Table  3.4  will  be  denoted  as  Da  and  the  balanced 
design  as  D^.  The  parameter  space  investigated  consisted  of  the  point  estimates, 
point  estimates  ±  standard  error  and  point  estimates  ±  2  standard  errors.  Figures 
3.8  and  3.9  contain  the  max  and  min  plots  of  SMSEP  and  the  QDGs  of  the 
SMSEP  for  the  two  designs.  It  is  evident  that  the  two  designs  exhibit  very  similar 
prediction  capabilities.  There  is  considerable  variation  in  the  SMSEP  throughout 
the  experimental  region,  being  low  near  the  center  and  larger  near  the  boundaries. 
The  designs  do,  however,  appear  to  have  a  robustness  to  the  parameter  values. 

3.6     Conclusions 

The  proposed  graphical  technique,  based  on  using  the  QDGs  for  the  mean 
squared  error  of  prediction,  provides  a  procedure  for  evaluating  and  comparing 
designs  for  GLMs.   The  QDGs  provide  information  concerning  the  prediction 
capability  of  a  design  and  its  sensitivity  to  the  parameter  values.  This  information 
is  useful  in  the  evaluation  and  comparison  of  designs. 

Myers  (1999),  in  considering  designs  for  GLMs,  stated  that  standard  response 
surface  designs  may  work  well  in  such  experimental  situations.   Because  of  the 
dependency  of  design  performance  on  the  model  parameters,  it  is  difficult  to  judge 
the  performance  of  such  designs.  Myers  reiterates  the  need  for  robust  designs  that 
are  efficient  despite  the  dependency  on  model  parameters.  Further  work  with  QDGs 
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and  other  graphical  techniques  may  be  useful  in  evaluating  standard  response 
surface  designs  and  in  the  search  for  robust  designs. 


CHAPTER  4 

EVALUATION  AND  COMPARISON  OF  STANDARD  RSM 

DESIGNS  UNDER  NON-STANDARD  CONDITIONS 

4.1     Introduction 

The  development  of  a  graphical  technique  for  the  evaluation  and  comparison 
of  experimental  designs  for  logistic  regression  models  was  presented  in  chapter  3. 
The  Quantile  Dispersion  Graphs  (QDGs)  give  a  graphical  representation  of  the 
prediction  capabilities  of  a  design  based  on  the  distribution  of  the  mean-squared 
error  of  prediction.  They  also  convey  information  about  the  effects  of  the  design, 
location  of  prediction,  and  parameter  values  on  the  prediction  capability  of  a 
particular  experimental  situation. 

This  chapter  examines  the  implementation  of  the  QDGs  for  the  evaluation 
and  comparison  of  classical  RSM  designs  when  utilized  in  non-standard  conditions. 
A  discussion  is  also  presented  concerning  the  General  Equivalence  Theorem  and  the 
sequential  generation  of  optimal  designs. 

4.2    Preliminary  Results 
Extensive  research  has  been  done  in  the  area  of  experimental  design  for  linear 
models  with  the  traditional  error  assumptions  of  normality,  independence,  and 

constant  variance.  For  the  first-order  model, 

k 

y  =  $>  +  ^2  @iXi  +  e' 

fcsl 

where  Xi,x2,-..  ,  re*  is  a  set  of  control  variables,  the  class  of  orthogonal  designs  has 
many  desirable  properties.  This  class  includes  2k  factorial  designs,  certain  fractions 


50 


51 

thereof,  as  well  as  simplex  and  Plackett-Burman  designs  (Khuri  and  Cornell  1996, 
chap.  3).  For  the  second-order  model, 

*  k  fc-l        k 

t=l  i=\  i=\  i<j  j=2 

examples  of  common  designs  include  the  3fe  factorial  designs,  Box-Behnken  designs, 
and  central  composite  designs  (CCD)  (Khuri  and  Cornell  1996,  chap.  4). 

In  a  non-standard  situation,  when  the  response  is  not  measured  on  a 
continuous  scale,  or  when  the  traditional  assumptions  fail,  a  researcher  may  continue 
to  consider  using  these  classical  RSM  designs.  In  such  situations,  how  do  these 
designs  measure  up?  Myers  (1999,  p.  34)  states  that  standard  RSM  designs  may 
work  well  in  these  situations,  but  wonders,  "How  would  one  know?" 

The  QDGs,  which  were  developed  in  chapter  3,  can  be  used  to  evaluate  and 
compare  the  prediction  capabilities  of  classical  RSM  designs.  In  the  comparison  of 
these  RSM  designs  it  is  not  only  of  interest  to  compare  designs  against  one  another, 
but  also  to  compare  the  performance  of  the  designs  to  an  "optimal"  design.  Myers 
(1999,  p.  40)  states  that  any  comparison  of  designs  must  include  a  comparison  with 
an  "optimal"  design.   Basically,  the  optimal  design  will  serve  as  a  base-line  or  a 
reference  point  with  which  to  compare  the  designs. 

4.2.1     Design  Theory 

The  concepts  of  design  measure  and  moment  matrix  are  important  ideas 
in  the  theory  of  optimal  designs  and  will  be  discussed  in  this  section.  Khuri  and 
Cornell  (1996,  chap.  12,  sec.  2)  provide  an  introduction  to  optimal  design  theory. 

An  experimental  design  over  an  experimental  region,  R,  can  be  defined 
as  a  probability  measure  and  is  referred  to  as  a  design  measure,  £.  The  design 
measure,  C,  satisfies  the  two  properties  £(x)  >  0  for  all  x  e  R  and  fRd((x)  =  1. 
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A  design  measure  is  said  to  be  discrete  if  it  consists  of  n  points  in  R  where 
((xk)  =  mk/N,  if  xk  is  a  design  point,  and  0  otherwise  (k  =  1, 2, . . .  ,  n).  Note,  that 
mk  denotes  the  number  of  observations  being  collected  at  the  fcth  distinct  design 
point  (k  =  1, 2, . . .  ,  n)  and  N  =  YH=i  mk-  AU  otner  design  measures  are  called 
continuous  design  measures. 

For  the  definition  of  the  moment  matrix,  consider  a  linear  model  of  the  form 

y  =  f(x)(3  +  e. 

The  moment  matrix  of  a  design  measure  £  is 

M(C)  =  K(C)]  =  [  /  /i(a)/i(*K(«)] 

JR 

where  fi(x)  is  the  ith  element  of  f(x)  evaluated  at  a  point  xeR(i  =  l,2,...,p). 
For  a  discrete  design  C,  it  follows  that  M«)  =  £Li  =*■  f(xk)f'{xk)  =  jjX'X 
where  X  results  from  expressing  the  linear  model  in  matrix  notation,  that  is, 
y  =  X/3  +  e.   Using  the  moment  matrix,  the  standardized  prediction  variance 
function  is  defined  to  be  d(x,()  =  f'(x)[M(C)}'lf{x). 

A  design  measure  is  defined  to  be  D-optimal  if  it  maximizes  |M(()|  over  the 
class  of  all  design  measures.  A  design  measure  is  G-optimal  if  it  minimizes  over  the 
class  of  all  design  measures  the  maximum  standardized  prediction  variance  over  the 
experimental  region  R. 

4.2.2     Equivalence  Theorem  and  Extensions  for  GLMs 

The  Equivalence  Theorem  of  Kiefer  and  Wolfowitz  (1960)  established  a 
relationship  between  D-optimality  and  G-optimality.  This  result  has  been  utilized 
in  the  evaluation  of  designs  as  well  as  the  construction  of  designs  (Wynn  1970). 
The  Equivalence  Theorem  basically  establishes  three  design  properties  that  are 
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equivalent  and,  hence,  occur  simultaneously.  When  applied  to  linear  models,  the 
equivalence  theorem  gives  the  result  that  a  D-optimal  design  measure  £*  satisfies 
the  following  properties  simultaneously: 

C*  maximizes  |M(£)|,over  the  class  of  all  design  measures 

C*  minimizes  maxxd(x,  C),  over  the  class  of  all  design  measures 

maxxd(x,C)  =  P, 

where  for  linear  models,  d(x,()  =  f  \x)[M (C)]_1 f  (as)  is  the  standardized  prediction 
variance,  and  p  is  the  number  of  parameters  in  the  linear  model.  This  result  gives 
the  equivalence  of  D-  and  G-optimal  designs  for  linear  models.   Note  that  for  a 
discrete  design  measure,  d(x,()  =  ^Var[y(x)],  where  Var[y(a;)]  is  the  prediction 
variance  at  x. 

White  (1973)  provided  an  extension  of  the  Equivalence  Theorem  to  nonlinear 
models  by  giving  a  generalization  of  the  variance  function,  d(x,Q-   Noting  the 
dependence  of  designs  upon  the  parameters  of  a  given  nonlinear  model,  White 
defines  a  design  measure  (*  to  be  D(/3)-optimal  if  it  maximizes  the  determinant  of 
M((,0)  for  a  given  /3  over  the  class  of  all  design  measures.  The  generalization  of 
d(x,  £)  is  given  by 

d(x,  C,  0)  =  trace{  J(z,  /3)[M(C,  P)}'1},  (4.1) 

where  I{x,P)  is  the  information  matrix  at  the  point  x.  The  (i,  j)th  element  of 

J(z,/3)is 

E[(— )(— )] 

where  L  is  the  log  of  the  probability  density  function  or  probability  mass  function 
associated  with  a  random  variable  observed  at  the  point  x  (White  1973,  p.  345). 
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The  information  matrix  at  x  for  GLMs  can  be  written  as  (Atkinson  and  Haines 
1996,  p.  456) 

r<»-«=vS5wikw/(")m  <4'2) 

The  Equivalence  Theorem  can  be  extended  by  defining  a  design  to  be  G(/3)-optimal 
if  it  minimizes  the  maxxd{x,  £,  0)  for  a  given  (3  over  the  class  of  all  design  measures. 
The  Extended  Equivalence  Theorem  given  by  White  states  that  the  following  three 
properties  of  a  design  measure  £*  are  equivalent 

C*  is  D(/3)  optimal,  over  the  class  of  all  design  measures 
C  is  G(/3)  optimal,  over  the  class  of  all  design  measures 
maxxd(x,C/3)  =  p. 

Application  of  the  Extended  Equivalence  Theorem  to  GLMs 

For  generalized  linear  models,  the  response  y(x)  has  a  distribution  in 
the  exponential  family  with  a  mean,  E[?/(a!)]  =  n(x).    The  linear  predictor, 
r)(x)  =  f'{x)fi  is  related  to  the  mean  through  a  strictly  monotonic  increasing  link 
function,  g(-).  The  relationship  is  r)(x)  =  g[v{x)}  =  /'(x)/3,  and  using  the  inverse 
function  h(-)  =  <7-1(-)  gives  the  result  that  fj.(x)  =  h(rj(x)).  The  model  for  the 
response  can  be  written,  as 

y{x)    =    n{x)  +  e 

=    h[V(x)]  +  e, 

where  the  random  error,  e,  has  a  distribution  in  the  exponential  family  (see  chapter 
3,  section  3.2,  equation  3.2).  The  moment  matrix  for  GLMs  for  a  discrete  design 
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measure  C  is 

M^^X'WX^t^^fMfM,  (4.3) 

where  <j>  is  the  dispersion  parameter,  the  summation  is  over  the  n  distinct  design 
points  with  replication,  ra^,  N  =  Ylk=i  mfc'  an<^  $^  1S  a  diagonal  matrix  whose  uth 

diagonal  element  is     , 1  m  =  xr — 1~, — },   ,  , — :tttt,  u  =  1,  2  . . .  ,N.  The  information 
o  u)(x„,^)       Var[2/(xu)]{5'[M(xu)]}2'  ' 

matrix,  I(x,/3),  at  a;  is  defined  as       *  *  f(x)f'(x)  (see  equation  (4.2)).  Using  this, 

w(x,p) 

the  variance  function  of  White  (1973),  (4.1)  can  be  written  as  follows 

d(x,(,P)    =    traceil^MMtC/S)]-1} 

=    tracej^-^/^)/'^)^^,^)]-1} 
f'(x)[M(03)}-lf(x) 


u>(*,/9) 


A  motivation  for  the  above  result  can  be  given  by  recalling  that  in  linear  models, 
the  standardized  prediction  variance  is  d(x,  ()  =  ^Var[y(x)]  for  a  discrete  measure 
£.  For  GLMs  there  is  a  choice,  however,  concerning  the  use  of  either  the  prediction 
variance  of  the  linear  predictor,  T](x),  or  the  estimated  mean  response,  p,(x).  The 
following  result  establishes  that  the  two  approaches  are  equivalent  and  agree  with 
the  variance  function  of  White  (1973). 

Under  the  GLMs  setting,  consider  f)(x)  =  f'{x)J3  and  ji{x)   = 
h[fj(x)].   The  prediction  variance  of  7)(x),  Varfr^x)],  is  approximately  equal  to 
jf'(x)[X'WX]~1f(x),  and  the  prediction  variance  of  jl(x)  using  first-order 
Taylor's  series  is  Var(/x(as))  m  \f'(x)[X'WX\-lf{x){h'{r){x)]}2.  Applying  the 
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analogue  of  the  scaling  from  linear  models  results  in 

rVar[A(*)]    =    f(z){M(CM-lf(*f 


Var[j/(x)]  '  Var[?/(x)] 

=    f(x)[M(C,0)-lf(m) 


VM[y(x)}[g>(»(xW 
f'(x)[M(U3)-lf(x) 


w(x,/3) 
=    d(x,C,/3). 

Therefore,  the  two  approaches  of  getting  d(x,(,P)  are  equivalent. 
Example     Logistic  Regression  Models 

Binary  response  data  are  commonly  modeled  by  using  logistic  regression 
models.    The  binary  response,  y{x),  has  the  binomial  distribution  with 
index,  1,  and  probability  of  success,  ir(x).    The  mean  of  the  response  is 
n(x)  =  ir(x)  and  the  variance  is  Var[y(x)}  =  tt(x)[1  —  n(x)].    Using  the 
logistic  link  function  gives  log   yz^y    =  f'{x)(3  =  r)(x).  The  weight,  w(x,P), 
equals  n/X)\i-K<x)]  anc*  t^ie  moment  matrix  for  a  discrete  design  measure  £  is 
M((,  $)  =  jfX'WX  =  Zl=l  $*(**)[!  -  7r(xk)]f(xk)f'(xk),  where  mk  if  the 
number  of  experimental  units  at  the  fcth  distinct  experimental  run  (Jb  =  1,2, ...  ,  n). 
The  standardized  prediction  variance,  d(x,(,0),  follows  as 

w(x)[l  -  n(x)}f'(x)[M(C,/3)]-lf(x)  m  N*(x)[l  -  n(x)]f(x)[X'WX]^f{x). 

(4.4) 

This  is  the  appropriate  variance  function  for  using  the  Extended  Equivalence 
Theorem  and  establishing  optimal  design  results.  Note  that  since  the  variance  of  the 
response  is  no  longer  constant,  scaling  of  the  prediction  variance  does  not  remove 
the  variance  term  completely.  The  presence  of  the  binomial  variance,  7r(a;)[l  -  ir(x)], 
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is  evident  in  the  variance  function.  The  variance  function  incorporates  the  variance 
of  the  response,  y(x),  at  the  location,  x. 

4.2.3    Sequential  Generation  of  D-optimal  Designs 

Using  the  Equivalence  Theorem  of  Kiefer  and  Wolfowitz  (1960),  Wynn  (1970) 
proposed  a  sequential  procedure  for  the  generation  of  D-optimal  designs  for  a  linear 
model.  Wynn's  procedure  generates  a  sequence  of  designs  as  follows: 

1.  Start  with  an  initial  discrete  design,  D°,  with  n0  points,  xx,  x2, . . .  ,  xno,  which 
support  the  fitting  of  the  linear  model. 

2.  Find  a  point,  x°,  which  maximizes  the  variance  function, 

d(x,D°)  =  f'(x)[M(D°)]-'f(x). 

Augment  D°  by  adding  the  point  a;0  to  get  the  design  D1. 

3.  Repeat  step  (2)  to  obtain  a  sequence  of  designs  D°,  D1, . . .  ,  Dn, . . .  where  Dn 
is  obtained  from  Dn~x  by  adding  a  point  of  maximum  standardized  prediction 
variance  based  upon  using  Dn~x. 

Wynn  established  that  this  sequence  of  designs  approaches  the  D-optimal  design.  In 
practice,  the  procedure  is  performed  until  "convergence"  of  the  variance  function  to 
p  is  attained. 

A  natural  extension  of  the  sequential  procedure  for  the  generation  of  designs 
for  GLMs  and  nonlinear  models  would  involve  the  use  of  the  Extended  Equivalence 
Theorem  and  the  standardized  prediction  variance  function.  The  procedure  would 
require  a  specification  of  initial  parameter  values  and,  therefore,  the  resulting  design 
would  be  locally  optimal. 
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Sequential  Generation  of  Design  -  Single- Variable  Logistic  Model 


For  the  single-variable  logistic  model,  log 


ir(x) 


=  /30  +  /3ix,  the  D-optimal 


1  — 7r(x) 

design  is  known  to  be  a  two-point  symmetric  design  with  equal  replication  at  the 
x  values  corresponding  to  effective  dose  (ED)  probabilities  of  0.824  and  0.176 
(Abdelbasit  and  Plackett  1983;  Minkin  1987). 

To  illustrate  the  sequential  generation  of  designs,  the  procedure  was  applied 
to  the  single-variable  logistic  model.  Initial  parameter  values  of  0  for  the  intercept, 
Po,  and  1  for  the  slope,  /%,  were  used  resulting  in  the  model,  log    ;r5a     =  x- 
Let  f3°  =  (0, 1).    The  D-optimal  design  is  then  to  have  equal  replications  at 
logjgj  =  -1.5437  =  %x  and  log|ff|  =  1.5437  =  x2.  The  design,  D°,  consisting  of 
the  3  points  of  -1,  0,  and  1  with  equal  weights,  i.e.  jfj-  =  |,  was  used  as  the  initial 
design  and  the  experimental  region,  R,  was  selected  to  be  [—1,1].  The  standardized 
prediction  variance  function  in  (4.4)  was  utilized.  The  maximum  of  d(x,D°,(3°)  over 
R  is  equal  to  2.417  and  is  attained  at  the  location  x  =  —1.  This  value  was  added  to 
the  initial  design  to  obtain  the  design  D1.  Application  of  the  sequential  procedure 
either  added  the  value  of  x  =  —  1  or  1  to  the  previous  design.  The  design  with  25 
points  resulted  in  a  maximum  variance  function  of  2.031  while  the  design  with  75 
points  resulted  in  a  value  of  2.010.  As  can  be  seen,  the  D-optimal  design  consists 
of  points  located  on  the  boundary  of  the  experimental  region,  as  is  the  case  with 
D-optimal  designs  for  linear  models. 

The  above  procedure  was  performed  again  using  an  experimental  region 
[—2, 2]  and  the  same  initial  3-point  design,  D°.  The  initial  design  had  a  maximum 
variance  function  value  of  3.694  at  the  location  x  =  2.  This  point  was  added  to  the 
initial  design  and  the  resulting  design  had  a  maximum  variance  function  value  of 
3.460  at  the  point  x  =  —1.94.  The  sequential  generation  of  designs  was  continued 
until  a  77-point  design  had  a  maximum  variance  function  value  of  2.010.   The 
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Table  4.1:  Frequency  Table  For  Sequential  Design  Points 


X 

Frequency 

Percent 

-1.94 

1.3 

-1.78 

1.3 

-1.68 

1.3 

-1.64 

1.3 

-1.6 

1.3 

-1.58 

2.6 

-1.56 

9.1 

-1.54 

23 

29.9 

-1 

1.3 

0 

1.3 

1 

1.3 

1.54 

22 

28.6 

1.56 

9.1 

1.58 

2.6 

1.6 

1.3 

1.64 

1.3 

1.68 

1.3 

1.76 

1.3 

1.94 

1.3 

2 

1.3 

structure  of  the  design  can  be  seen  in  Table  4.1.  The  design  has  converged  towards 
the  D-optimal  design  of  equal  replication  at  xx  =  -1.5437  and  x2  =  1.5437.  The 
starting  design  points  of -1,  0,  and  1  are  clearly  seen  to  be  very  poor  points.  Recall 
that  this  design  is  a  locally  D-optimal  design  for  the  initial  parameter  values  of 
/?o  =  0  and  ft  =  1.  Atkinson  and  Donev  (1992,  p.   119)  note  that  at  this  point 
there  are  several  approaches  for  more  precisely  finding  the  D-optimal  design.  These 
approaches  include  exchange  algorithms  in  which  the  poor  points  from  the  initial 
design  are  dropped  and  the  sequential  procedure  is  started  again.  Other  possibilities 
include  analytical  optimization,  if  possible,  and  numerical  methods  which  use  the 
design  from  the  sequential  procedure  as  a  starting  point. 
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4.3    Scaled  Mean-Squared  Error  of  Prediction 
In  chapter  3,  the  development  of  the  QDGs  involved  the  use  of  the 
mean-squared  error  of  prediction  (MSEP).  The  MSEP  incorporates  both  the 
prediction  variance  and  prediction  bias. 

MSE[/i(x)]  =  Var[A(«)]  +  {Bias[fi(x)}}2 . 

The  Equivalence  Theorem  establishing  the  relationship  between  D-optimality  and 
G-optimality  uses  a  standardized  prediction  variance  function.  For  linear  models, 
this  function  results  in  a  quantity  that  is  free  of  scale  and  is  placed  on  a  per 
observation  basis.  Giovannitti-Jensen  and  Myers  (1989),  in  their  development  of 
variance  dispersion  graphs,  utilized  the  variance  function,  d(x,Q,  for  linear  models. 
Recently,  Khuri  and  Lee  (1998)  proposed  quantile  plots  of  the  estimated  scaled 
mean-squared  error  of  prediction  for  the  evaluation  and  comparison  of  designs  for 
nonlinear  models.  The  scaled  mean-squared  error  of  prediction  is  given  by 

The  first  portion  can  be  recognized  as  the  standardized  prediction  variance  function, 
d(x, (,,()).  This  portion  depends  on  the  location  of  prediction,  x,  the  parameters, 
/3,  and  the  experimental  design.  This  quantity  can  be  evaluated  by  using  the  design 
weights,  j*-,  which  enter  in  the  moment  matrix,  M{C,,j3).  From  equation  (4.3),  we 
note  that  the  moment  matrix  depends  only  upon  the  design  weights  and  the  distinct 
design  points  of  the  experimental  design.   Working  with  the  design  weights  gives 
flexibility  in  the  consideration  of  designs  and  removes  the  dependency  upon  N  since 
the  computation  of  d(x,  C,  0)  would  only  require  knowledge  of  the  distinct  design 
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points  and  the  design  weights.  Rewriting  the  second  portion  in  (4.5)  as 

N 


{Bias[/i(x)]}2  =  {JVBiast/Ha?)]}' 


Var[2/(a;)]1         LrA    ;JJ         NVar[y(x)] 

also  allows  the  computation  of  NBias[fi(x)]  in  terms  of  the  design  only  through  the 
distinct  design  points  and  the  design  weights.  In  order  to  see  this,  consider  that 

iVBias[/2(z)]  =  ATBiasfrKa;)]  j*  +  ~  Varfrf*)]  ~  (4.6) 

O/i  ^  tt/7 

(see  chapter  3,  section  3.2,  equation  3.8).  We  consider  the  two  portions  of  equation 
(4.6)  separately.  Prom  the  first  portion,  consider  iVBias[r/(x)]  =  N  f  (x)Bia,s((3) .  It 
needs  to  be  shown  that  ./VBias(/3)  can  be  calculated  using  the  distinct  design  points 
and  the  design  weights.  From  section  3.2  of  chapter  3  equation  (3.5)  it  follows  that 

7VBias[/3]    =    N{-^-){X'WX)-lX'ZdFlN 


<-£' 


x'wx 

N 


-1 


X  ZdFlpf 


=    (-ip#(C,0)]-1M*Iljr 

The  moment  matrix  can  again  be  computed  using  the  distinct  design  points  and  the 
design  weights.  It  needs  to  be  established  that  X' ZdFlN  can  be  computed  using  the 
distinct  design  points  and  design  weights  as  well.  Zd  is  a  diagonal  matrix  which  can 
be  written  as  a  block  diagonal  matrix  with  the  matrices  Zkklmk  along  the  diagonal, 
where  zkk  is  the  element  (see  chapter  3,  section  3.2,  equation  3.5)  associated  with 
the  feth  distinct  design  point  with  replication,  mfe.  F  is  also  a  block  diagonal  matrix 
with  the  matrices  fkklmk  along  the  diagonal,  where  /fefc  is  the  element  (see  chapter 
3,  section  3.2,  equation  3.5)  associated  with  the  fcth  distinct  design  point  with 
replication,  mk.  X'ZdFlN  can  now  be  expressed  as  53fc=i  mkZkkfkkf{xk)  where  the 
summation  is  over  the  n  distinct  design  points.  In  order  to  use  the  design  weights, 
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we  need  jf  in  the  expression.  Consider  using  Ylu=imk^zkkfkkf(xk)-  Note  that 
Nzkk  is  equal  to  f'(xk)         ^  /(**)    (*  =  1>  2>  •  •  •  > n)-  It;  finally  follows  that 

X'ZdFlN  can  be  calculated  using  the  distinct  design  points  and  the  design  weights. 
Considering  now  the  second  portion  of  equation  (4.6),  note  that 

N 


NV<ir[fi(x)}    =    -f(x) 
<t> 


x'wx 


/(•) 


Therefore,  the  second  portion  of  equation  (4.6)  depends  upon  the  moment  matrix, 
M(C,/3)  and  can  be  computed  without  knowledge  of  N.  Note,  however,  that  the 
prediction  bias  portion  of  the  scaled  mean-squared  error  of  prediction  (SMSEP) 
does  depend  upon  the  total  sample  size,  N.  The  researcher  must  therefore  have 
some  idea  about  the  number  of  experimental  runs  available  for  the  experiment.  If 
the  researcher  is  only  interested  in  using  the  standardized  prediction  variance  the 
comparisons  can  be  performed  without  the  specification  of  N. 

4.4    SMSEP  for  the  Logistic  Regression  Model 
For  binary  response  data  modeled  using  the  logistic  link  function,  the 
mean-squared  error  of  prediction  is  given  by 

MSE[tt(x)]  =  Var[vr(x)]  +  (Bias^x)]}2. 

The  appropriate  scaling  for  this  situation  is  ir(x)\i'-ir(X)]  w^c^  *s  the  tota^  sample 
size  divided  by  the  binomial  variance  for  a  single  observation  at  the  location,  x.  The 
scaled  mean-squared  error  of  prediction  takes  the  form 

—  N      ,y*»w).  y?, ,  +  ffi 5jfi M'>'       (4.7) 


7r(x)[l  -  »(*)]  7r(x)[l  —  ir(x)]       Ntt(x)[1  —  tt(x)} 
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The  SMSEP  depends  upon  the  unknown  /3,  the  design,  and  the  control  variables, 
i.e.  the  elements  of  x.  Let  td{x,0)  =  SMSEP[7r(a:)]  where  D  is  a  given  design.  To 
evaluate  td(x,0)  for  a  given  a  design,  an  estimate  of  td(x,I3)  can  be  obtained  by 
replacing  (3  with  (3.  Otherwise,  some  parameter  space  concerning  /3,  denoted  by  C, 
must  be  considered.  The  QDGs  can  be  constructed  for  the  standardized  prediction 
variance  function  and  the  SMSEP  as  illustrated  in  the  following  example. 

4.5    A  Numerical  Example 

Consider  again  (see  chapter  3,  section  3.5)  the  cytotoxicity  study  described 
in  Carter  et  al.  (1986,  sec.  2.5).  The  investigation  involved  two  control  variables 
and  interest  was  in  modeling  n(x),  the  probability  of  death  of  the  cells.   The 
experimental  design,  a  42  factorial  design,  and  response  values  yu  are  given  in  Table 
4.2,  and  the  parameter  estimates  for  a  complete  second-order  model  for  the  linear 
predictor  are  given  in  Table  4.3.  The  generalized  linear  models  procedure  (PROC 
GENMOD)  in  SAS  (1997)  was  used  to  fit  the  model. 

The  results  of  the  experiment  were  utilized  as  preliminary  information  for  the 
evaluation  and  comparison  of  designs  by  using  the  estimates  as  initial  guesses  for  the 
parameter  values.  The  subset,  C,  of  the  parameter  space  was  obtained  by  selecting 
3  points  for  each  parameter,  namely,  the  estimate  and  the  estimate  plus/minus  two 
standard  errors. 

It  would  be  of  interest  to  evaluate  and  compare  the  prediction  capabilities 
of  two  second-order  designs,  namely  a  32  factorial  design  and  a  central  composite 
design.  These  designs  can  be  compared  against  a  D-optimal  design  which  will  need 
to  be  obtained  through  sequential  generation. 
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Table  4.2:  Experimental  design  and  response  values 


*1 

X2 

Number  of 

Total  Number 

MMS/10 

PMA 

Dead  Cells  ru 

of  Cells  mu 

Vu  —  1m- 

0 

0 

19 

98 

0.19 

0 

1 

24 

87 

0.28 

0 

10 

54 

91 

0.59 

0 

100 

68 

87 

0.78 

1 

0 

16 

91 

0.18 

10 

0 

17 

90 

0.19 

25 

0 

19 

92 

0.21 

1 

1 

19 

94 

0.20 

1 

10 

41 

90 

0.46 

1 

100 

79 

93 

0.85 

10 

1 

10 

83 

0.12 

10 

10 

36 

85 

0.42 

10 

100 

62 

83 

0.75 

25 

1 

36 

92 

0.39 

25 

10 

56 

93 

0.60 

25 

100 

74 

87 

0.85 

1436 


Table  4.3:  Parameter  Estimates 


Parameter 

Estimate 

Std  Error 

A) 

-1.3430 

0.1232 

A 

-0.0818 

0.0270 

02 

0.1598 

0.0165 

Ai 

0.0039 

0.0010 

#22 

-0.0013 

0.0002 

&2 

-0.0001 

0.0002 
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4.5.1  Sequential  Generation  of  D-optimal  Design 

The  experimental  region  is  rectangular  with  x\  €  [0,25]  and  x2  €  [0, 100]. 
The  sequential  generation  of  a  D-optimal  design  was  performed  using  the  parameter 
estimates  from  the  "initial"  experiment  as  the  initial  parameter  values.  The  initial 
design,  D°,  consisted  of  the  16-point  42  factorial  design  used  by  Carter  et  al.  (1986) 
The  initial  design  can  be  seen  in  plot  A  of  Figure  4.1.  The  sequential  generation 
resulted  in  a  design  with  N  =  375  experimental  runs  and  a  maximum  d(x,£,/3) 
value  equal  to  6.043.  This  design,  however,  consists  of  only  36  distinct  points  (see 
plot  B  of  Figure  4.1).  Notice  that  the  points  tend  to  fall  into  clusters  or  groups 
which  suggests  that  a  reduction  of  the  points  can  be  obtained.  A  cluster  analysis 
was  performed  using  the  S-Plus  language  (Venables  and  Ripley  1994,  sec.   12.2) 
which  reduced  the  number  of  distinct  design  points  to  9.  The  resulting  design  had 
N  =  375  experimental  runs  with  a  maximum  d(xX,P)  value  of  6.085  (see  plot 
C  of  Figure  4.1).  In  an  attempt  to  find  the  D-optimal  design  more  precisely,  this 
design  was  used  as  an  initial  design  and  the  sequential  procedure  started  again. 
This  resulted  in  an  additional  33  runs  and  after  clustering  the  resulting  design  had  9 
distinct  design  points  with  408  experimental  runs  and  a  maximum  d(x,  £,  0)  value 
of  6.028.  This  approximate  D-optimal  design  (see  plot  D  of  Figure  4.1)  appears  to 
be  an  adjusted  32  factorial  design. 

4.5.2  Comparison  of  Designs 

For  the  comparison  of  designs,  the  approximate  D-optimal  design  (see  plot 
A  of  Figure  4.2)  along  with  a  32  design  and  CCD  designs  were  considered.  The  32 
design  consisted  of  the  9  points  associated  with  xi  G  0, 12.5, 25  and  x2  G  0, 50, 100 
with  equal  replications  and  405  experimental  runs  (see  plot  B  of  Figure  4.2).  The 
CCD  designs  used  a  coded  axial  value  of  a  =  |  in  order  to  remain  within  the 
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experimental  regions.  The  levels  for  the  CCD  designs  (corresponding  to  the  coded 
levels  of  -1,  -|,  0,  },  1)  were  %x  <E  0,  6.25, 12.5, 18.75,  25  and  x2  G  0,  25,  50,  75, 100. 
Two  CCD  designs  were  used  with  405  experimental  runs.  The  first  CCD  design 
(CCD1)  consisted  of  equal  replication  of  the  9  design  points.   The  second  CCD 
(CCD2)  design  replicated  the  center  point  twice  as  much  as  the  factorial  and  axial 
points  (see  plot  C  of  Figure  4.2).  The  experimental  designs  are  given  in  Table  4.4. 
The  QDGs  can  be  graphed  for  both  the  standardized  prediction  variance 
and  the  SMSEP.  First,  the  performance  of  the  two  CCD  designs  are  compared 
(see  plots  of  Figure  4.3).  It  can  be  seen  that  on  the  boundary  of  the  experimental 
region  (A  =  1)  the  performance  of  the  two  designs  are  very  similar.  Towards  the 
center  (A  =  0.6)  the  performance  of  design  CCD2  is  slightly  better,  but  overall 
the  performances  of  the  two  designs  are  comparable.  Since  the  CCD2  design  had 
more  replication  at  the  center  it  is  expected  that  it  would  have  better  prediction 
capabilities  near  the  center.  For  the  overall  comparison  of  designs,  we  consider  the 
approximate  D-optimal  design,  32  design,  and  the  CCD1  design.  The  QDGs  for  the 
standardized  prediction  variance  and  SMSEP  for  A  =  1,0.8,  and  0.6  are  given  in 
Figures  4.4,  4.5,  and  4.6.  On  the  boundary  of  the  region,  the  CCD  design  has  the 
largest  dispersion.  The  CCD  design  exhibits  great  variation  in  the  quantiles  and  the 
quantile  values  larger  than  the  median  appear  sensitive  to  the  parameter  values.  The 
magnitudes  of  the  scaled  quantities  are  also  noted  to  be  large  on  the  boundary.  The 
scaling  involves  multiplying  by  ntX)\i-K(x]\  w^ich  can  get  large  for  values  of  7r(a;) 
near  zero  or  one.  Moving  towards  the  center  of  the  region  A  =  0.8  (see  Figure  4.5), 
the  CCD  design  still  has  variation  in  the  quantile  values.  The  D-optimal  design  and 
the  32  design,  however,  have  max  quantiles  for  the  standardized  prediction  variance 
larger  than  the  CCD  design  and  have  greater  sensitivity  to  the  parameter  values. 
The  CCD  design  is  affected  by  the  bias  portion  of  the  SMSEP  and  again  exhibits 
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Table  4.4:  Experimental  Designs 


D-optimal  Design  (N=408) 

xl 

X2 

MMS/10 

PMA 

Design  Weight 

0.0 

1 

55/408 

0.0 

18 

38/408 

0.0 

100 

61/408 

12.5 

4 

22/408 

12.5 

20 

44/408 

12.5 

100 

41/408 

25.0 

0 

54/408 

25.0 

15 

32/408 

25.0 

100 

61/408 

32 

Design  (N=405) 

Xi 

X2 

MMS/10 

PMA 

Design  Weight 

0.0 

0 

1/9 

0.0 

50 

1/9 

0.0 

100 

1/9 

12.5 

0 

1/9 

12.5 

50 

1/9 

12.5 

100 

1/9 

25.0 

0 

1/9 

25.0 

50 

1/9 

25.0 

100 

1/9 

CCD  Designs  (N=405) 

Xi 

X2 

Design  Weight 

MMS/10 

PMA 

CCD1  CCD2 

0.00 

0 

1/9    1/10 

0.00 

100 

1/9    1/10 

25.00 

0 

1/9    1/10 

25.00 

100 

1/9    1/10 

6.25 

50 

1/9     1/10 

18.75 

50 

1/9    1/10 

12.50 

25 

1/9     1/10 

12.50 

75 

1/9     1/10 

12.50 

50 

1/9    2/10 
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large  dispersion  in  the  QDG  of  SMSEP.  Near  the  center  A  =  0.6  (see  Figure  4.6),  the 
D-optimal  design  shows  great  variation  and  sensitivity  to  parameter  values  for  both 
quantiles  of  the  standardized  prediction  variance  and  SMSEP.  The  32  design  and 
CCD  design  have  much  better  looking  QDGs.  The  CCD  design  is  slightly  better 
than  the  32  design  for  the  standardized  prediction  variance,  whereas,  the  32  design  is 
slightly  better  for  the  SMSEP.  Overall,  the  D-optimal  design  seems  to  perform  well 
on  the  boundary  and  poorly  near  the  center.  The  CCD,  however,  does  well  near  the 
center  of  the  region  and  poorly  on  the  boundary  and  is  affected  by  the  prediction 
bias.  The  32  design  appears  to  be  a  nice  compromise  in  this  situation,  in  fact,  this  is 
a  CCD  design  with  coded  axial  runs  at  a  =  1. 

4.6     Conclusions 
This  chapter  investigated  the  use  of  QDGs  as  a  graphical  tool  for  the  analysis 
of  the  prediction  capabilities  of  classical  RSM  designs  in  non-standard  situations. 
In  order  to  compare  the  designs  to  optimal  designs  a  discussion  of  the  sequential 
generation  of  D-optimal  designs  was  presented.  The  sequential  generation  of  designs 
involved  an  application  of  the  Extended  Equivalence  Theorem.   The  QDGs  are 
extended  to  the  standardized  prediction  variance  and  the  scaled  mean-squared  error 
of  prediction.  For  evaluating  and  comparing  classical  RSM  designs,  the  QDGs  allow 
for  an  assessment  of  parameter  dependence  caused  by  the  non-standard  conditions. 
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CHAPTER  5 
MODELING  FUNCTIONS  OF  MULTIPLE  RESPONSES 

5.1     Introduction 

As  previously  mentioned  in  chapter  2,  response  surface  methodology  was 
developed  around  the  traditional  assumptions  of  a  linear  model  and  a  normally 
distributed  response  variable.  However,  in  a  multiresponse  experiment,  a  number 
of  responses  are  observed  and  the  variable  of  interest  to  an  experimenter  may  be 
calculated  from  the  responses.  For  example,  Hunter  and  Jaworski  (1986)  note  that 
in  some  chemical  experiments,  the  yield  is  obtained  by  multiplying  the  observed 
values  of  conversion  and  selectivity,  while  in  other  situations  the  profit  is  often 
computed  from  several  observed  responses.  Also,  as  mentioned  in  chapter  2,  in  the 
semiconductor  industry,  the  experimenter  is  often  interested  in  modeling  the  ratio  of 
responses.  These  experimental  situations  present  a  challenging  problem  of  modeling 
a  function  of  multiple  responses. 

In  Section  5.2  we  review  "naive"  approaches  for  modeling  a  function  of 
multiple  responses.  In  Section  5.3  we  present  a  modeling  procedure  that  addresses 
the  multivariate  nature  of  the  data.  Section  5.4  consists  of  numerical  examples.  This 
is  followed  by  a  summary  of  recommendations  and  conclusions  in  Section  5.5. 

5.2     "Naive"  Modeling  Procedures 
For  the  modeling  of  a  function  of  multiple  responses  there  are  two  standard 
methods  that  have  been  utilized  in  the  model  building  process.  In  this  section,  we 
explain  and  critique  these  two  methods. 
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The  first  method  reduces  the  problem  to  a  single  variable  situation  in  which 
standard  modeling  techniques  can  be  applied.  The  procedure  consists  of  modeling 
the  calculated  response  as  a  polynomial  function  of  the  control  variables.  For  each 
experimental  run  the  function  of  interest  is  applied  to  observed  responses  reducing 
the  data  for  the  run  into  a  single  response  variable.  This  method  does  not  recognize 
the  multivariate  nature  of  the  data  and,  as  a  result,  it  does  not  take  into  account 
the  correlation  structure  among  the  responses.  The  distribution  of  the  computed 
response  may  also  violate  the  standard  assumptions  for  linear  modeling.  If,  for 
example,  the  computed  function  involves  a  sum  of  responses,  the  distributional 
assumptions  may  be  feasible.  On  the  other  hand,  if  the  computed  function  consists 
of  a  complex  function  such  as  a  product  or  ratio  of  responses,  there  could  be  serious 
violations  of  the  distributional  assumptions.  Avoiding  the  multivariate  nature  of 
the  data  and  possible  violations  of  distributional  assumptions  can  result  in  a  less 
efficient  approach  for  the  modeling  of  the  function  of  interest. 

The  second  method  was  proposed  by  Hunter  and  Jaworski  (1986)  as  an 
alternative  approach  to  modeling  the  computed  response  variable  of  interest.  The 
procedure  consists  of  fitting  individual  models  to  the  responses  and  using  the 
function  of  interest  on  the  fitted  models  to  construct  a  model  for  the  response  of 
interest.   Hunter  and  Jaworski  (1986,  p.   321)  suggest  that  their  method  results 
in  flexible  models  and  greater  insight  into  the  process  under  investigation.  This 
procedure,  while  utilizing  the  individual  responses,  fails  to  address  the  multivariate 
nature  of  the  data  in  the  fitting  of  the  models.  Furthermore,  this  procedure  does 
not  take  into  account  any  information  from  the  structure  of  the  function  of  interest. 
The  two  procedures  can  be  visualized  in  Table  5.1.  The  first  procedure  consists  of 
working  with  the  rows  first,  by  computing  the  response  of  interest  for  each  run  and 
then  modeling  the  values  in  the  last  column.  The  method  proposed  by  Hunter  and 


Table  5.1:  Modeling  a  Function  of  Multiple  Responses 
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Individual  Obs. 

on  r  Responses 

Response  of  Interest 

Run 

2/i 

2/2      •• 

2/i      ••• 

yr 

G(2/i,2/2,...  ,2/r) 

1 

2/u 

2/12 

2/ii      •  ■  • 

2/lr 

C?(2/ll,  J/12,  •  •  •   ,2/lr) 

2 

2/21 

2/22        •  •  • 

2/2f       •  •  • 

2/2r 

G{y2l,y22,---   ,2/2r) 

u 

2/ul 

2/u2        •  •  • 

2/ut 

2/ur 

G(2/ul,2/u2,---   ,2/ur) 

n 

Vm 

2/JV2      •  •  • 

ym    ■■• 

2/ATr 

G(yNi,yN2,.--  ,2/JVr) 

model 

model 

functions 

2/i 

2/2      ••• 

fji      ■■■ 

2/r 

function 

Source:  Hunter  and  Jaworski  (1986,  p.  325) 


Jaworski,  however,  consists  of  working  with  the  columns  of  the  individual  responses 
and  then  combining  the  fitted  models  in  the  last  row  to  obtain  a  fitted  model  for  the 
response  of  interest. 

These  two  methods  lack  foundations  which  would  motivate  the  procedures. 
They  consist  of  ad-hoc  and  naive  approaches  to  the  problem.   Both  approaches 
ignore  the  multivariate  nature  of  the  data  in  the  experimental  situations  and  are 
therefore  not  taking  into  account  any  relationships  that  may  exist  among  the 
individual  response  variables. 


5.3     Multiresponse  Analysis  and  Proposed  Modeling  Procedure 
For  an  experimental  situation  involving  a  response  of  interest  that  is 
computed  from  individual  responses,  it  is  important  to  consider  the  multivariate 
nature  of  the  data.  In  this  section,  we  review  the  analysis  of  the  linear  multiresponse 
model  and  propose  a  new  procedure  for  the  modeling  of  a  function  of  responses  that 
addresses  the  multivariate  nature  of  the  situation. 
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Suppose  that  there  are  r  responses,  namely,  yi,y2,. . .  ,Vr  each  of  which  is 
believed  to  depend  upon  a  set  of  k  control  variables  denoted  by  xi,x2,. . .  ,xk  and 
the  experimenter  is  interested  in  modeling  a  function  of  the  responses,  namely, 
G{y\,y2,...  ,yr)-   Khuri  and  Cornell  (1996,  chapter  7)  present  a  discussion  of 
methods  for  the  analysis  of  multiresponse  experiments.  We  summarize  here  the 
estimation  of  a  linear  multiresponse  model. 

Let  us  assume  that  the  individual  response  variables  can  be  modeled  using 
polynomial  regression  models  in  the  Zj's.  The  model  for  the  ith  response  is  then  of 
the  form 

yi  =  fi(x)(3i  +  ei,    z  =  l,2,...,r  (5.1) 

where  /,-(*)  is  a  vector  of  order  pi  x  1  whose  elements  consist  of  powers  and  products 
of  powers  of  the  elements  of  x  =  (xi,x2, . . .  ,  xk)',  and  0t  is  a  vector  of  p{  unknown 
constant  coefficients.  From  an  experiment  consisting  of  N  runs,  the  following  model 
for  the  ith  response  can  be  obtained  from  (5.1) 

Vi  =  Xfii  +  a  i  =  l,2,...,r  (5.2) 

where  y{  and  64  are  N  x  1  vectors  of  observations  and  random  errors  for  the  ith. 
response  and  X{  is  an  N  x  p{  matrix  whose  uth  row  is  /-(aju)    u  =  1, 2, . . .  ,  N.  As 
a  final  step  in  forming  the  linear  multiresponse  model,  the  models  in  (5.2)  can  be 
combined  to  construct  a  multiresponse  model  of  the  form 

y  =  Xp  +  e  (5.3) 

where  y  =  [y[  :  y'2  :    . .  :  y'r]>,  0  B  {#  :  fa  :  . . .  :  #.]',  e  =  [e[  :  e^  :  . . .  :  <]',  and 
X  is  the  block-diagonal  matrix,  diag(Xl5  X2, . . .  ,  Xr).  The  assumptions  about  the 
random  errors  are:  E(e«)  =  0,  Var(Ci)  =  auIN,  and  Cov(ei,  e^)  =  ay IN.  Thus, 
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random  errors  from  the  same  experimental  run  are  assumed  to  be  correlated,  but 
random  errors  from  different  experimental  runs  are  uncorrelated.   Let  £  denote 
the  r  x  r  matrix  whose  (i,  j)th  element  is  a^    (i,j  =  1,2,...  ,  r),  which  is  the 
variance-covariance  matrix  for  the  r  responses.    It  follows  that  e  in  (5.3)  has  a 
variance-covariance  matrix  of  the  form 

Var(e)  =  A  =  £  ®  IN.  (5.4) 

It  follows  that  the  best  linear  unbiased  estimator  of  0  is  given  by 

£  =  {X'A-'X^X'A-'y  (5.5) 

with  a  variance-covariance  matrix  equal  to  (X'A_1X)_1.   Since  E  is  typically 
unknown,  an  estimate  of  it  needs  to  be  used  in  (5.5).  A  consistent  estimator  of  E 
was  proposed  by  Zellner  (1962)  which  uses  the  residual  vectors  from  an  ordinary 
least-squares  fit  to  the  single-response  models  in  (5.2).  This  estimator  is  given  by 
£  =  (&ij)  where 

ai5  = — * J-   i,j  =  1,2,...  ,r.    (5.6) 

Replacing  E  with  £  in  (5.4)  results  in  A*  =  £  ®  1^.  Using  A*  in  (5.5)  we  obtain 
the  estimator 

F  =  (X'A^Xy'X'A^y.  (5.7) 

This  is  called  the  seemingly  unrelated  regression  estimator  (SURE)  of  f3.  It  is  also 
called  the  estimated  generalized  least-squares  estimator  of  j3.  Khuri  and  Cornell 
(1996,  p.  254)  note  that  if  the  single-response  models  in  (5.2)  have  the  same  Xi 
matrices,  that  is  Xi  =  X0,  then  the  best  linear  unbiased  estimator  of  f3  does  not 
depend  on  E.  In  this  case,  the  best  linear  unbiased  estimator  of  j3{  in  (5.2)  is  the 
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same  as  the  ordinary  least-squares  estimate  obtained  from  individually  fitting  the 
zth  model  (*  =  1, 2, . . .  ,r). 

Let  us  now  consider  a  multivariate  situation  in  which  the  experimenter  has 
interest  in  modeling  a  function  of  the  responses,  for  example,  G(f/i,  2/2)-  Using  linear 
models  for  the  means  of  the  two  responses  results  in: 

£(*)    =    E[yl(x)}  =  f[(x)f3l 
H2(x)    =    E[y2(x)}  =  f2(x)(32. 

Observations  within  an  experimental  run  are  correlated  with  a  variance-covariance 
matrix  denoted  by  E.  The  linear  multiresponse  model  estimation  described  above 
can  be  utilized  for  the  estimation  of  filt  (32,  and  S.  It  follows  that  the  fitted  models 
are: 

We  now  propose  a  procedure  for  modeling  a  function  of  multiple  responses 
that  recognizes  the  correlation  among  the  responses  and  takes  into  account  the 
structure  of  the  function  of  interest.  The  proposed  procedure  involves  approximating 
the  function  of  interest  by  using  a  Taylor's  series  approximation  about  the  means  of 
the  individual  responses.  We  assume  that  G(yi,  y2)  is  continuous  and  has  continuous 
partial  derivatives  of  order  <  3  in  a  neighborhood  of  (/ii,/i2).  Using  a  first-order 
approximation  olG(yi,y2)  in  a  neighborhood  of  (/Zi,/i2)  results  in 


^  =  G(pltfH)  + 


GifHyfh).  (5.8) 


This  approximation  can  be  used  to  develop  approximate  expressions  for  the  expected 
value  and  variance  of  G(yi,y2).  The  expected  value  expression  resulting  from  the 


first-order  approximation  is 
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E[G(yi,y2)]  «G(^i,/i2). 


(5.9) 


The  variance  expression  is  approximately  given  by 


Var  [(7(2/1,2/2)]    ~    crn 


dGQjkuHa) 


1 2 


+2al2 


dyi 
dG(fiUfi2) 
dyi 


022 


dG{nun2) 


dy2 
dy2 


(5.10) 


Let  us  denote  the  approximate  standard  deviation  of  G(yi,  y2),  given  as  the  square 
root  of  the  right-hand  side  of  (5.10),  by  t(x,(31,/32,  £).  The  expressions  in  (5.9)  and 
(5.10)  depend  upon  the  unknown  means  and  variance-covariance  of  the  responses. 
The  first-order  prediction  of  G(yi,y2)  uses  the  results  of  the  analysis  of  the  linear 
multiresponse  model.  That  is,  the  predicted  value  of  G(yi,  y2)  at  a  point  x  is  equal 
to 

G^(x)  =  G(fi1(x),fi2(x)). 

In  order  to  assess  the  quality  of  prediction  evaluated  over  all  the  design  points,  we 
consider  using  scaled  and  unsealed  sum  of  absolute  deviations.  The  scaled  sum  of 
absolute  deviations  is  defined  as 

5(i)         v^  \G(yui,yU2)  -Gw{xu)\ 

where  yui  denotes  the  uth  value  of  the  ith  response  (i  =  1,2,  . . .  ,  r;  u  =  1,2,  . . .  , 
N).  The  unsealed  sum  of  absolute  deviations  is 

N 


T(l)    =    ElGW,**)-^*.)!. 


«=i 


Using  a  second-order  approximation  of  G(yi,y2)  in  a  neighborhood  of  (fluffy) 
results  in 
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z™    =    G(/xi,/i2)  + 


1 
+  2 


(j/i  -A»i)q—  +  (2/2-/^2) 7T 
dyi  dy2 

d2 


G(/xi,/i2) 
a2 


(2/i  -  Mi)3^j  +  2(2/i  "  *>(*  -  M*)^  +  (2/2  -  /,2)2|| 


G(Mi»Ma)- 

(5.11) 


This  second-order  approximation  leads  to  the  approximate  expressions  for  the 
expected  value  and  variance  of  Cr(jfr,  jfc)  given  in  (5.12)  and  (5.13). 


E  [£(3/1,2/2)]    «    G(ni,/i2) 


+- 


(Til 


dPG(iii,fi2) 
dy\ 


+  2cr 


12 


(PGinuto) 
dy\dy2 


+  022 


d2G{/i1,fx2) 


dvl 


(5.12) 


Using  properties  of  the  moments  for  the  multivariate  normal  distribution  (Anderson 
1984,  p.    49),  the  variance  and  covariance  terms  in  the  second-order  variance 
approximation  can  be  simplified. 
RESULT  1: 

Var  [(»-  ^f]    =    E[(y,-*)4]-[E[(*-*)8]]a 

as     3(<Tij)     -  (<T«) 
=     2(flr«)2,t=l,2 

RESULT  2: 

Varied -/n)(»- Mi)]    =    E[(|ft-Mi)2(y2-Ma)2]-lE[(»i-MiK»-M2)]]a 

=     (^n)(a22)  +  2(ai2)2-(a12)2 

=      Oil  022  +  (CT12) 
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Var[G(yi,yj)]    «    <ru 


dyi 

+Var(2/!  -  /xi)2 


+  C22 


dGJtiufMk) 


+  2<712 


'dG(ni,H2) 
dyi 


.      9y2      . 


192G(//1)M2) 
2 


^2 


+ 


Var(t/2  -  M2)2 


l$>G(fH,th) 

2      a?/2 


d2G(ni,fi2) 


dyidy2 
dG(ni,fj,2) 
dyi 


+V«r{fo .-  |«i)(«a  -  /*»)] 

+2Cov[(2/1-/i1),(?/1-/ii)2] 
+2Cov  [(2/1  -  a*i),  (2/1  -  MiKlfe  -  Ma)] 
+2Cov  [(2/1  -  fix),  (y2  -  /x2)2] 
+2Cov  [(3/2-M2),  (2/1  -/ii)2] 
+2Cov  [(y2  -  Ma),  (yi  -  Mi) (2/2  -  M2)] 
+2Cov  [(y2  -  Ma),  (2/2  -  M2)2] 
+2Cov  [(?/i  -  /xO2,  (yi  -  nx){y2  -  M2)] 
+2Cov  [(ift  -  /iO2,  (ya  -  M2)2] 


dG(nx,H2) 
dyi 


ia2G(/xi,M2) 
2        ay2 

d2G(fjll,^) 


dG(fj,x,  n2) 

dyx 
dG{nx,n2) 

dy2 


<9G(/xi,//2) 
dy2 


dyxdy2 
ld2G(fix,fi2y 
2        at/2 

2        ay2 

d2G(nx,V2) 


dG(fii,H2) 
dy2 


dyxdy2 
ld2G(fix,ti2y 
2 


dy22 


ld2G(tix,»2) 
2 


dy\ 


ld2G(fix,»2) 
2 


+2C0V  [(?/i-  /Xi)(t/2-M2),(2/2-M2)2]      - 


d2G{nx,V2) 
dyxdy2 


ia2G(^,/i2) 
2        at/; 


^Q(Mi»Ma) 

dyi%2 


2      ay2 


(5.13) 
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RESULT  3: 

Cov  [(yt  -  m),  (yi  -  m)2]    =    E  [(in  -  Hi){Vi  -  m)2]  -  e  [(yi  -  m)]  e  [(Vi  -  m)2] 

=  e  [dm  -  ^)3] 

=    0,  i  =  l,2 

RESULT  4: 

Cov[(yi- Hi),(yi- Hi)(yj- Hj)]    =    E  [(//,■-//.,■)(;</,■-  //,)(<7,  -  /-',)] 

-E  [(j/i  -  Hi)}  E  [(tfj  -  Hi)(Vj  ~  lh)] 
=    E  [(yi  -  Lii)2(yj  ~  fij)] 
=    0 

RESULT  5: 

Cov  [{yi  -  fit),  {yj  -  Hj)2]     =    E  [(y{  -  (i^fa  -  Hj)2]  -  E  [(#  -  ^)]  E  [(yrf  -  fij)2] 

=    E  [(%  -  tM){Vj  -  t*j)*] 

=    0 

RESULT  6: 

Cm  [(»  -  m)%,  (tfi  -  /*)(«*  -  pj]     =    E[(yi- iii)2{yi- Hi){yj-Vj)] 

-E  [(yi  -  Hi)2]  E  [(y{  -  Hi)(yj  -  fij)] 
=    E  [(yt  -  mf(yj  -  fij)]  -  (JaOij 
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RESULT  7: 


Cov  [(yi  -  Hi)\  {vi  -  h)2}  =  E  [to  -  Vifivj  -  m)2]  -  E  Vvt  -  m)2] E  [(vi  -  ih)' 

=    E  [(yt  -  Hi)2(Vj  ~  Vj)2}  -  VuVjj 


=    oaOjj  +  ((Jij)   4-  (oij)    -  Ono 


33 


=    2(<ry)s 


Using  Results  1-7,  the  second-order  variance  approximation  in  (5.13)  can  be 
rewritten  as  follows: 

2 


Var[G(yi,2fc)]    ~    crn 


dG  (1*1,1*2) 


022 


+2(an)5 


dyi 

2        dy\ 


dG(ni,fi2) 
dy2 

2 

+  2(a22)2 


+  2(7 


12 


dG  (1*1,1*2) 


l&GfjiuiH) 

2 


0j/i 
2 


dG  (1*1,1*2)' 
9y2 


dyl 


+  (^11^22  +  (<7l2)2) 

+2(2ana12) 
+2(2(<r12)2) 
+2(2cr22cri2) 


'dPG(f*i,(*2) 
dyidy2 


2       9^ 

2        9y2 
d2G(f*1,L*2) 


dyidy2 


'&G  (1*1,1*2) 
dy\dy2 

2        0i£ 

2        ^2 


(5.14) 


The  square  root  of  the  right-hand  side  of  (5.14)  is  denoted  by  ^(x,/31,/32,  E). 
The  second-order  prediction  of  G(yi,y2)  is  obtained  by  using  the  results  from  a 
multiresponse  analysis  in  (5.12)  and  (5.14).  The  second-order  predicted  value  for 
the  function  of  interest  is  equal  to 

&2\x)     m     (?(&(*),  &(*)) 


1 

+  2 


1      gg^iCg^g))  ffGfote),  £,(»))   ,   .    d2G(f*1(x),i*2(x)y 

°11 5T5 1"  2(Ti2 5 — 5 h  CT22 


atf 


dyidy2 


dyl 
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In  order  to  assess  the  quality  of  prediction,  we  consider  again  using  both  scaled  and 
unsealed  sum  of  absolute  deviations.  The  scaled  sum  of  absolute  deviations  is 

5(2)         y^  \G(yui,yV2)  -  &2){xu)\ 
j*j-        6(xui  &,&,$) 

The  unsealed  sum  of  absolute  deviations  is 

N 

T™    =    2lO(Hta,lfca)-dW(«.)|. 

8=1 

The  proposed  method  based  upon  the  Taylor's  series  approximation  to  the 
function  of  interest  takes  into  account  both  the  multivariate  nature  of  the  data  and 
the  complexity  of  the  function. 

5.4     Numerical  Examples 
In  this  section,  four  examples  illustrating  the  proposed  procedure  for 
modeling  a  function  of  multiple  responses  are  discussed.  The  first  three  examples 
involve  the  product  of  two  responses,  G(yi,y2)  =  yiy2,  while  the  last  example 
involves  the  modeling  of  the  ratio  of  two  responses,  G{y\,  y2)  =  Ul- 

Example  1 

Hunter  and  Jaworski  (1986)  investigate  a  chemical  experiment  where  the 
yield  for  each  run  is  computed  as  the  product  of  conversion  yi  and  selectivity  y2. 
The  response  of  interest  is  therefore  G(yi,y2)  =  Ifctfc.  The  experiment  used  a  22 
factorial  design  with  one  center  run.  The  two  control  variables  were  temperature  X\ 
and  time  x2.  The  design  settings  and  the  observed  values  for  the  two  responses,  as 
well  as  the  product,  are  given  in  Table  5.2. 

Hunter  and  Jaworski  (1986)  analyzed  this  experiment  using  both  the  standard 
naive  procedure  and  their  proposed  methodology.  The  standard  naive  procedure  for 
modeling  a  function  of  responses  fits  a  linear  model  to  the  last  column  consisting  of 
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Table  5.2:  Chemical  Data  for  Example  1 


x\  x2         i/i         yi    G{yi,y2)  =  yiV2 

-1  -1  0.799  0.502                0.401098 

1  -1  0.203  0.113                0.022939 

-1  1  0.997  0.901                0.898297 

1  1  0.401  0.499                0.200099 

0  0  0.602  0.500 0.301000 

Source:  Hunter  and  Jaworski  (1986,  p.  324) 


the  observed  products.  This  results  in  the  following  fitted  model  for  the  product 

G{yx,y2)  =  0.365  -  0.269a;!  +  0.168x2. 

This  first-order  model  resulted  in  a  sum  of  absolute  residuals  equal  to  0.384.  If  the 
model  is  allowed  to  include  the  interaction  term,  the  fitted  model  obtained  is 

G(yi,y»)  =  0.365  -  0.269a;!  +  0.169:r2  -  O.OSOx!^. 

This  results  in  a  sum  of  absolute  residuals  equal  to  0.127. 

Now,  consider  applying  our  proposed  methodology  based  upon  the  linear 
multiresponse  model  analysis  and  Taylor's  series  approximation  to  the  function 
G(yi,y2)  =  2/i2/2-  Considering  first-order  models  for  both  of  the  response  means 
results  in  the  following  models 

Hi(x)    =    f\{x)01  =  An  +  ftiari  +  fk\x2 

t*i(x)      =      f'2(x)P2  =  A)2  +  fta*J  +  l^22X2. 

Since  the  model  forms  are  the  same  for  the  two  models,  the  multiresponse  estimates 
and  the  ordinary  least  squares  estimates  of  /3X  and  /32  will  be  the  same.  Because  of 
this,  the  predictions  from  the  Hunter  and  Jaworski  method  will  agree  with  Gr>l'(ac), 
the  first-order  prediction,  from  the  proposed  procedure.  The  proposed  procedure, 
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however,  takes  into  account  the  variation  of  G(yi,y2)  in  assessing  the  quality  of 
prediction.  Multiresponse  estimation  resulted  in  the  following  fitted  models  and 
estimate  of  the  variance-matrix 

Ai(aj)    =    0.6004-0.298011  +  0.0990x2 
fa(x)    =    0.50300-0.19775a;i  +  0.19625a;2 
6.40e-07    -1.20e-06 
-1.20e-06       1.07e-05 

The  first-order  approximation  in  (5.8)  for  the  function  G(yi,y2)  =  y\y2  is 

ZW      =     A*lAi2  +  (2/1  —  Ml)M2  +  (2/2  —  A*2)A*1- 

Using  (5.9)  and  (5.10)  results  in  the  following  expressions  for  the  expected  value  and 
variance  of  the  product 

E[G{yuy2)}    M    pi/fe. 


Var[G(|/i,i/2)]    ~    0-11^2  +  cr22A*?  +  2ai2^i/i2 


f*2     A*l 

E 

^2 

W 

The  first-order  prediction  for  the  product  is  therefore  fj,i(x)  £i2(x) .    Table  5.3 
contains  the  results  from  the  first-order  prediction.  The  assessment  of  the  quality 
of  prediction  gives  an  unsealed  sum  of  absolute  deviations  T^  =  0.0088012  and  a 
scaled  sum  of  absolute  deviations  S'1'  =  4.891519.  The  procedure  performs  quite 
well  when  compared  to  the  naive  procedure. 
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Table  5.3:  Example  1:  Results  from  First-Order  Prediction 


x\  X2  2/12/2  predicted  deviation 

~~A  -1  0.401098  0.4032973  -0.0021993 

1  -1  0.022939  0.0221706  0.0007684 

-1  1  0.898297  0.8946678  0.0036292 

1  1  0.200099  0.2013021  -0.0012031 

0  0  0.301000  0.3020012  -0.0010012 


Applying  the  second-order  approximation  from  (5.11)  to  the  function 
G(yi,y2)  results  in  the  following 

z(2)     =     fixfl2  +  (Vi  ~  /*l)/*2  +  (2/2  -  l*»}f*i 
+(2/i  ~  /*i)(ite  ~  to)- 

Using  (5.12)  and  (5.14)  gives  the  following  for  the  expected  value  and  variance  of 
the  product 

E[G{y1,y2)]    «    A*iA*2  +  °\v 

Var[G (2/1,2/2)]    ~    <rul4  +  ff»J*i  +  teuptiit  +  ax\<?n  +  o%3- 

The  second-order  prediction  is  therefore  G^(x)  =  fii(x) ^(x)  +  o"i2.    The 
second-order  prediction  is  a  correction  of  the  first-order  prediction  by  adding  on 
the  estimated  covariance  between  the  responses.   Table  5.4  contains  the  results 
from  using  the  second-order  prediction.   The  estimated  covariance  between  the 
responses  is  small.   As  a  result,  the  second-order  predictions  are  close  to  the 
first-order  predictions.  The  assessment  statistics  are  as  follows:  T^  =  0.0088  and 
S(2)  =  4.891642.  The  proposed  procedure  has  performed  quite  well  in  comparison  to 
the  standard  naive  approach. 
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Table  5.4:  Example  1:  Results  from  Second-Order  Prediction 

xi  x2  yiV2  predicted  deviation 

-1  -1  0.401098  0.4032961  -0.0021981 

1  -1  0.022939  0.0221694  0.0007696 

-1  1  0.898297  0.8946666  0.0036304 

1  1  0.200099  0.2013009  -0.0012019 

0  0  0.301000  0.3020000  -0.0010000 

Table  5.5:  Example  2:  Design  Settings  and  Generated  Data 


%1 

X2 

2/i 

2/2 

G(m 

Vt)  =  2/12/2 

-1 

-1 

29.3 

79.3 

2323.49 

1 

-1 

50.3 

10.7 

538.21 

-1 

1 

8.9 

79.2 

704.88 

1 

1 

29.9 

70.3 

2101.97 

0 

0 

28.5 

58.6 

1670.10 

0 

0 

29.4 

59.1 

1737.54 

Example  2 

The  second  example  involves  the  product  of  two  responses  as  well,  however, 
the  two  models  are  slightly  different.   The  fitted  model  for  the  first  response  is 
a  first-order  model,  while  the  fitted  model  for  the  second  response  includes  the 
interaction  term  X\X2  as  well.  The  design  settings  and  generated  data  are  given  in 
Table  5.5. 

The  standard  naive  procedure  resulted  in  the  following  first-order  model  for 
the  observed  product 

■  G{Vl,y2)  =  1512.6983  -  97.0475a;i  -  13.7125x2. 

The  sum  of  absolute  residuals  for  this  model  was  3564.613.  Adding  the  interaction 
term  to  the  fitted  model  reduced  the  sum  of  absolute  residuals  to  764.4867  and 
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Table  5.6:  Example  2:  Results  from  First-Order  Prediction 


£i     %2  V\V2     predicted     deviation 


-1 

-1 

2323.49 

2296.3715 

27.11847 

1 

-1 

538.21 

518.7799 

19.43014 

-1 

1 

704.88 

684.7532 

20.12681 

1 

1 

2101.97 

2076.5965 

25.37347 

0 

0 

1670.10 

1749.2878 

-79.18778 

0 

0 

1737.54 

1749.2878 

-11.74778 

r(i)  = 

=  182.9844 

5(D 

=  4.8225 

resulted  in  the  following  model 

G(yuy2)  =  1512.6983  -  97.0475a;!  -  13.7125x2  +  795.5925x1X2. 

The  analysis  of  the  linear  multiresponse  model  gives  the  following  fitted  models  and 
estimate  of  the  variance-covariance  for  the  responses. 

Ai(x)    =    29.383  +  10.5a;i-10.2x2 

fa(x)  59.533  -  19.375xi  +  14.875x2  +  14.925x^2 

0.1613889    0.1855556 
0.1855556    0.2543056 

Tables  5.6  and  5.7  give  the  results  of  the  first-order  and  second-order  predictions 
from  using  the  analysis  of  the  linear  multiresponse  model  and  the  proposed 
procedure  for  modeling  the  product  of  responses.  The  procedure  again  performs  well 
in  predicting  the  observed  products. 

Example  3 

The  third  example  involves  the  product  of  two  responses  with  the  different 
models,  however,  the  design  settings  do  not  form  an  orthogonal  design.  The  fitted 


E    = 
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Table  5.7:  Example  2:  Results  from  Second-Order  Prediction 


Xi 

X2 

2/12/2 

predicted 

deviation 

-1 

-1 

2323.49 

2296.5571 

26.93292 

1 

-1 

538.21 

518.9654 

19.24458 

-1 

1 

704.88 

684.9387 

19.94125 

1 

1 

2101.97 

2076.7821 

25.18792 

0 

0 

1670.10 

1749.4733 

-79.37333 

0 

0 

1737.54 

1749.4733 

-11.93333 

T&  =  182.6133 
S&  =  4.8121 


Table  5.8:  Example  3:  Design  Settings  and  Generated  Data 


Xi 

^2 

2/i 

2/2 

£(2/1,2/2)  =  2/i2/2 

-1.7 

-1.0 

20.88 

202.81 

4234.673 

1.5 

-0.8 

52.47 

111.83 

5867.720 

-1.0 

0.7 

14.46 

213.68 

3089.813 

0.9 

0.9 

29.08 

162.46 

4724.337 

-1.0 

-1.4 

34.65 

211.22 

7318.773 

1.0 

-0.6 

45.98 

139.76 

6426.165 

-1.3 

1.0 

5.83 

176.24 

1027.479 

1.0 

0.8 

32.18 

181.58 

5843.244 

0.0 

0.0 

29.59 

162.22 

4800.090 

0.0 

0.1 

27.87 

168.42 

4693.865 

-0.2 

0.0 

29.17 

194.56 

5675.315 

model  for  the  first  response  is  again  a  first-order  model  while  the  fitted  model  for 
the  second  response  includes  the  interaction  term  X\X2.  The  design  settings  and 
generated  data  are  given  in  Table  5.8. 

The  standard  naive  procedure  resulted  in  the  following  first-order  model  for 
the  observed  products 

£(2/1,2/2)  =  4908.4565  -  895.5674x!  -  1416.3569z2. 


£    = 
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The  sum  of  absolute  residuals  for  this  model  was  7588.454.  Again  the  interaction 
term  was  added  to  the  fitted  model.  This  reduced  the  sum  of  absolute  residuals  to 
6050.417  and  resulted  in  the  following  model 

G(yi,  y2)  =  4869.957  -  1031.834a?i  -  1169.661x2  +  666.505x1X2. 

The  fitted  models  and  estimate  of  the  variance-covariance  for  the  responses  from  the 
linear  multiresponse  model  analysis  are 

/ii(x)    =    29.7505  +  10.0896xi  -  9.9212x2 

fi2{x)  172.55421  -  19.8040xi  +  10.8298x2  +  15.4828xix2 

1.046311      13.07097 
13.070969    181.08776 

Tables  5.9  and  5.10  give  the  results  of  the  first-order  and  second-order  predictions 
from  using  the  analysis  of  the  linear  multiresponse  model  and  the  proposed 
procedure  for  modeling  the  product  of  responses.  Even  under  the  ill-conditioned 
design  settings,  the  proposed  procedure  performs  well  in  predicting  the  observed 
product  of  responses. 

Example  4 

The  last  example  investigates  the  modeling  of  the  ratio,  jk,  of  two  responses. 
The  nonlinearity  of  the  ratio  function  results  in  a  diffcult  and  interesting  model 
building  problem.   The  example  involves  the  ratio  of  two  responses.   The  first 
response  is  assumed  to  follow  a  second-order  model  without  interaction  in  two 
control  variables,  while  the  second  response  is  assumed  to  follow  a  first-order  model 
including  the  interaction  term.  Table  5.11  presents  the  design  settings  (32  factorial 
design  replicated  twice)  and  generated  data.  Using  a  complete  second-order  model 
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Table  5.9:  Example  3:  Results  from  First-Order  Prediction 


Xi 

x2 

yw2 

predicted 

deviation 

-1.7 

-1.0 

4234.673 

4992.807 

-758.1340 

1.5 

-0.8 

5867.720 

6106.467 

-238.7472 

-1.0 

0.7 

3089.813 

2404.618 

685.1952 

0.9 

0.9 

4724.337 

5293.220 

-568.8835 

-1.0 

-1.4 

7318.773 

6672.277 

646.4959 

1.0 

-0.6 

6426.165 

6271.903 

154.2620 

-1.3 

1.0 

1027.479 

1268.730 

-241.2509 

1.0 

0.8 

5843.244 

5544.777 

298.4669 

0.0 

0.0 

4800.090 

5133.572 

-333.4817 

0.0 

0.1 

4693.865 

4993.522 

-299.6567 

-0.2 

0.0 

5675.315 

4895.214 

780.1015 

T™  =  5004.675 

sw  = 

9.7339 

Table  5.10:  Example  3:  Results  from  Second-Order  Prediction 


X\ 

X2 

2/12/2 

predicted 

deviation 

-1.7 

-1.0 

4234.673 

5005.878 

-771.2050 

1.5 

-0.8 

5867.720 

6119.538 

-251.8182 

-1.0 

0.7 

3089.813 

2417.689 

672.1242 

0.9 

0.9 

4724.337 

5306.291 

-581.9545 

-1.0 

-1.4 

7318.773 

6685.348 

633.4249 

1.0 

-0.6 

6426.165 

6284.974 

141.1910 

-1.3 

1.0 

1027.479 

1281.801 

-254.3218 

1.0 

0.8 

5843.244 

5557.848 

285.3960 

0.0 

0.0 

4800.090 

5146.643 

-346.5527 

0.0 

0.1 

4693.865 

5006.593 

-312.7276 

-0.2 

0.0 

5675.315 

4908.285 

767.0305 

T&  =  ! 

5017.746 

S<2>  = 

9.76233 
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Table  5.11:  Example  4:  Design  Settings  and  Generated  Data 


Xi       %2 

2/1 

2/2 

G(lfc,H»)  =  £ 

-1   -1 

39.3 

67.2 

0.5848214 

1   -1 

66.1 

66.4 

0.9954819 

-1    1 

69.4 

47.1 

1.4734607 

1    1 

88.6 

41.5 

2.1349398 

-1   0 

48.7 

40.3 

1.2084367 

1   0 

66.8 

41.6 

1.6057692 

0 
0 
0 


0 
0 
0 


-1 

1 

0 

-1 

-1 

1 

1 

0 
0 

-1 
1 

0 


66.5 
74.8 
66.5 
44.2 
60.7 
65.8 
88.5 
54.9 
72.9 
47.3 
78.9 
58.7 


102.2 
50.0 
64.0 
77.5 
58.6 
44.2 
43.1 
59.2 
46.4 
35.8 
48.7 
52.6 


0.6506849 
1.4960000 
1.0390625 
0.5703226 
1.0358362 
1.4886878 
2.0533643 
0.9273649 
1.5711207 
1.3212291 
1.6201232 
1.1159696 


to  perform  the  standard  naive  procedure  for  modeling  the  ratio  results  in  the  fitted 
model. 

G(yu  y2)  =  1.1900  +  0.2620xi  +  0.4257x2  +  0.0970^  +  0.0437x1X2  +  0.0408^ 


The  sum  of  absolute  residuals  for  the  fitted  naive  model  was  1.991.   The  fitted 
models  and  estimated  variance-covariance  of  the  responses  from  the  analysis  of  the 
linear  multiresponse  model  are  given  below. 


ft2(x) 


£  = 


62.509  +  10.1083xi  +  11.825x2  +  0.0322^  +  2.754^ 

54.8  -  3.1583X!  -  11.0917x2  +  2.0542xix2 
17.88370   43.86602 
43.86602  162.12519 
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The  first-order  approximation  in  (5.8)  for  the  function  G(yi,y2)  =  {£  is 


in 


s(l)     =     ^l  +  (2/l_//l)_  +  (2/2_//2). 
^2  M2 


"^1 


f4    ' 


Using  (5.9)  and  (5.10)  results  in  the  following  expressions  for  the  expected  value  and 
variance  of  the  ratio 


E[G(</i,2/2)]    ~    - 

^2 


1   N2    ,  ,-Vl\2    ,    n_     /   1   w"IHi 


Var^,^)]    «    a11(-)2  +  a22(^-)2  +  2<r12(-)(^) 

M2  ^2  M2         ^2 


x 

« 


**2       J 


The  first-order  prediction  for  the  ratio  is  therefore  jg-.  Table  5.12  contains  the  results 
from  the  first-order  prediction. 

Applying  the  second-order  approximation  from  (5.11)  to  the  function 
G{yi,V2)  =  £  results  in  the  following 


V2 


iff    =    ^  +  (yi-^-  +  (y2-^)- 

M2  M2 


^2 


+  ; 


1 


1 


2(2/i  -  /ii)(j/2  -  M8)"3  +  (2/2  "  ^2)2^ 


/*2  ^2 

Using  (5.12)  and  (5.14)  gives  the  following  for  the  expected  value  and  variance  of 
the  ratio 


E[G(lft,lfe)]      M      ^+<712(4)+^22(^) 
A*2  ^2  ^2 
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Table  5.12:  Example  4:  Results  from  First-Order  Prediction 


Xi 

x2 

HI 

!/2 

predicted 

deviation 

-1 

-1 

0.5848214 

0.6098392 

-0.025017760 

-1 

0.9954819 

1.0477881 

-0.052306186 

-1 

1 

1.4734607 

1.4953926 

-0.021931849 

1 

2.1349398 

2.0474208 

0.087518976 

-1 

0 

1.2084367 

0.9046658 

0.303770966 

0 

1.6057692 

1.4068017 

0.198967496 

0 

-1 

0.6506849 

0.8110017 

-0.160316812 

0 

1 

1.4960000 

1.7636970 

-0.267697004 

0 

0 

1.0390625 

1.1406755 

-0.101613028 

-1 

-1 

0.5703226 

0.6098392 

-0.039516607 

-1 

1.0358362 

1.0477881 

-0.011951937 

-1 

1 

1.4886878 

1.4953926 

-0.006704788 

1 

2.0533643 

2.0474208 

0.005943486 

-1 

0 

0.9273649 

0.9046658 

0.022699106 

0 

1.5711207 

1.4068017 

0.164318955 

0 

-1 

1.3212291 

0.8110017 

0.510227307 

0 

1 

1.6201232 

1.7636970 

-0.143573801 

0 

0 

1.1159696 

1.1406755 

-0.024705947 

T™  =  2.149 

sw 

=  12.6793 
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Table  5.13:  Example  4:  Results  from  Second-Order  Prediction 


X\ 

%2 

ILL 
m 

predicted 

deviation 

-1 

-1 

0.5848214 

0.6207186 

-0.035897169 

-1 

0.9954819 

1.0820110 

-0.086529024 

-1 

1 

1.4734607 

1.5942769 

-0.120816145 

1 

2.1349398 

2.2061277 

-0.071187963 

-1 

0 

1.2084367 

0.9352695 

0.273167235 

0 

1.6057692 

1.4758762 

0.129893059 

0 

-1 

0.6506849 

0.8311822 

-0.180497283 

0 

1 

1.4960000 

1.8904094 

-0.394409396 

0 

0 

1.0390625 

1.1876500 

-0.148587499 

-1 

-1 

0.5703226 

0.6207186 

-0.050396017 

-1 

1.0358362 

1.0820110 

-0.046174775 

-1 

1 

1.4886878 

1.5942769 

-0.105589085 

1 

2.0533643 

2.2061277 

-0.152763453 

-1 

0 

0.9273649 

0.9352695 

-0.007904624 

0 

1.5711207 

1.4758762 

0.095244518 

0 

-1 

1.3212291 

0.8311822 

0.490046835 

0 

1 

1.6201232 

1.8904094 

-0.270286192 

0 

0 

1.1159696 

1.1876500 

-0.071680417 

J(2) 

=  2.7311 

5(2) 

=  13.7984 

Var[G(yi,ifc)] 

~  (i)" 

(-Pi 

+  ^22     2 

J  +2<712(  — 

/          VM2, 

+2  (<r22) 


2  '^Tjl      +  {<Tn(T22  +  (<Ju)2)   \~2 


+2  (2(T22(7i2)   (  -J 


12^i 
2  (4 


The  second-order  prediction  is  therefore  G^(x)  =  |4§y  +ai2  (^7x) )  +^22  (^(acj)- 
Table  5.13  contains  the  results  from  using  the  second-order  prediction.  The  proposed 
method's  performance  is  comparable  to  the  standard  naive  procedure.  The  standard 
naive  procedure  places  a  second-order  model  on  the  observed  responses,  while 
the  proposed  procedure  results  in  a  ratio  of  polynomial  models  for  the  first-order 
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prediction  and  a  correction  of  this  ratio  of  polynomials  for  the  second-order 
prediction. 

5.5     Summary 
In  this  chapter,  we  have  reviewed  naive  procedures  for  the  modeling  of 
functions  of  multiple  responses  and  proposed  a  new  method  that  incorporates  the 
multivariate  nature  of  the  data.  The  proposed  procedure  is  based  upon  the  analysis 
of  linear  multiresponse  models  and  Taylor's  series  approximation  of  the  function  of 
interest.  The  numerical  examples  presented  indicate  that  the  proposed  procedure's 
performance  is  comparable  to  and,  in  some  instances,  much  better  than  the  standard 
naive  procedure.  The  proposed  procedure  does  not  correct  for  individual  models  for 
the  responses  that  do  a  poor  job  of  fitting  observations.  It  is  therefore  important 
to  have  good  fits  for  the  individual  response  models  when  utilizing  the  proposed 
procedure.  There  is  also  a  concern  about  the  number  of  parameters  to  be  estimated 
and  the  amount  of  data  available.  Exploring  modern  techniques  for  the  estimation 
of  the  variance-covariance  matrix  may  lead  to  further  research  in  this  area.   It 
is  recommended  that  both  the  naive  and  proposed  procedures  for  modeling  the 
function  of  multiple  responses  be  used  in  any  particular  application.  Then,  if  the 
experimental  situation  allows  for  future  observations,  the  capabilities  of  the  resulting 
models  could  be  evaluated  by  predicting  and  observing  future  outcomes. 


CHAPTER  6 
SUMMARY  AND  FUTURE  RESEARCH 

The  research  developed  in  this  dissertation  has  been  concerned  with  the 
development  of  new  techniques  to  address  design  and  modeling  problems  encountered 
with  the  application  of  GLMs  and  multiresponse  analysis  in  RSM.  The  use  of  GLMs 
in  RSM  introduces  great  difficulty  in  the  construction,  evaluation,  and  comparison  of 
experimental  designs.  A  graphical  technique  has  been  introduced  for  the  comparison 
of  designs  for  logistic  regression  models.  In  multiresponse  experiments  the  outcome 
of  interest  may  be  obtained  through  the  computation  of  a  function  of  the  multiple 
responses.   A  modeling  procedure  has  been  proposed  for  such  an  experimental 
situation. 

In  chapter  3,  QDGs  for  the  logistic  regression  model  were  developed. 
Expressions  for  both  prediction  variance  and  prediction  bias  for  GLMs  were 
reviewed.   These  expressions  were  then  applied  to  the  logistic  regression  model 
to  formulate  the  mean-squared  error  of  prediction.  The  use  of  the  QDGs  of  the 
mean-squared  error  of  prediction  provides  a  convenient  technique  for  evaluating 
and  comparing  designs  for  logistic  regression  models.  The  proposed  plots  provide 
information  concerning  the  quality  of  prediction  of  a  design  and  its  sensitivity  to  the 
model's  parameter  values.  In  addition,  the  design's  performance  can  be  evaluated  at 
any  distance  from  the  center  of  the  experimental  region,  and  any  number  of  control 
variables  can  be  considered  in  the  fitted  model.  The  QDGs  are  also  well  suited 
for  sequential  experimentation.  It  is  possible  to  use  the  quantile  plots  to  augment 
an  existing  design  with  additional  points  in  order  to  improve  its  overall  prediction 
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capability.  If  the  experiment  is  performed  in  stages,  the  QDGs  can  make  use  of 
the  updated  information  concerning  the  parameters.  In  chapter  4,  the  QDGs  are 
extended  to  the  scaled  mean-squared  error  of  prediction.  This  extension  allows 
the  use  of  design  measures  in  the  specification  of  the  experimental  design  and  the 
evaluation  of  the  performance  of  standard  RSM  designs. 

Chapter  5  contains  a  proposed  procedure  for  the  modeling  of  functions 
of  multiple  responses.   A  multiresponse  analysis  of  the  data  and  Taylor's  series 
approximation  of  the  function  form  the  foundations  of  the  proposed  method.  As 
a  result,  the  proposed  modeling  technique  incorporates  correlations  among  the 
responses  and  the  nature  of  the  function  of  interest. 

Several  directions  for  extension  and  future  research  arise  from  this  research, 
and  include  the  following: 

1.  It  should  be  possible  to  extend  the  QDGs  to  other  GLMs.  This  includes 
modeling  of  data  from  normal,  Poisson,  and  gamma  distributions.  The 
mean-squared  error  of  prediction  has  been  developed  for  the  general  GLMs  so 
the  extension  should  be  feasible. 

2.  It  should  be  possible  to  modify  QDGs  to  allow  comparison  and  evaluation  of 
experimental  designs  for  estimation  of  the  median  effective  dose  (ED50)  in 
binary  response  experiments. 

3.  There  is  a  need  to  further  explore  the  relationship  between  the  settings  of 
designs  for  GLMs  and  the  unknown  model  parameters.  This  would  require 
multivariable  modeling  of  the  control  variables  as  functions  of  the  unknown 
parameters.  A  well  developed  relationship  of  this  sort  would  be  very  helpful  in 
acquiring  a  better  understanding  of  the  design  dependency  problem. 
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4.  The  development  of  inference  techniques  (for  example,  testing  and  confidence 
intervals)  for  the  proposed  modeling  procedure  for  functions  of  multiple 
responses  needs  to  be  done. 

5.  Optimization  techniques  need  to  be  developed  for  processes  that  are  evaluated 
using  functions  of  multiple  responses. 

6.  The  proposed  modeling  procedure  for  functions  of  responses  assumes 
continuous  random  variables  (typically  normal).  It  should  be  possible,  with  a 
lot  of  work,  to  develop  multiresponse  techniques  that  are  useful  and 
appropriate  in  experimental  situations  involving  responses  from  discrete 
distributions. 


APPENDIX  A 
DISTRIBUTION  OF  THE  RATIO 

Let  yi  and  y2  be  normally  distributed  random  variables  with  expected  value 

Hi  (i  =  1,2),  standard  deviation  <7;  (i  =  1,2)  and  correlation  coefficient  p.   The 

probability  density  function  of  v  =  [£  is 


,,  \  hi         rj^r        h        ,       ,  ,       -/i       ..      J\-  p2        .      -k      , 

V2Tra1a2g3       gy/l  -  p2  g^/1  -  p2         *e\0*!r         41  ~  (?) 

where 


— oo  <  v  <  oo, 

<7U         (7i<72         022 


an         <Tia2         cr22 


h  =  { )—  +  ( )— , 

01  02      01  02  01      02 


2(1  -  p»)    o^ 
and  <&(.)  denotes  the  cumulative  distribution  function  of  the  standard  normal 
random  variable  (Fieller  1932;  Hinkley  1969;  Shanmugalingam  1982). 
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APPENDIX  B 

S-PLUS  CODE:  MODELING  PRODUCT 

CHAPTER  5  EXAMPLE  1 

############## 

#  data  entry  # 
############## 

xl  <-  c(-l, 1,-1, 1,0) 

x2  <-  c(-l,-l, 1,1,0) 

xlx2  <-  xl*x2 

X  <-  cbind(rep(l,5),xl,x2) 

yl  <-  c(79.9,20.3,99.7,40.1,60.2)/100 

y2  <-  c(50.2,11.3,90.1,49.9,50.0)/100 

g.yl.y2  <-  yl*y2 

############################################ 

#  estimation  of  variance-covariance  matrix  # 
############################################ 

xpx  <-  tu)y.*y.x 

XpXinv  <-  solve (XpX) 

beta.l  <-  XpXinv,/„*,/.t(X),/.*'/.yl 

beta. 2  <-  XpXinvy.*y.t(X)y.*y.y2 

mu.l  <-  xy.*y.beta.l 

mu.2  <-  X*/.*y.beta.2 

res.l  <-  yl  -  mu.l 
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res. 2  <-  y2  -  mu.2 

Res  <-  cbind(res.l,  res. 2) 

Sigma  <-  (l/length(yl))*t(Res),/.*,/,Res 

########################## 

#  first-order  prediction  # 
########################## 

mu.l  <-  X*/,*c/.beta.l 

mu.2  <-  X0/.*°/.beta.2 

gl  <-  mu. l*mu.2 

devl  <-  g.yl.y2  -  gl 

Tl  <-  sum(absCdevl)) 

var.func  <-  f unction (Sigma, mu. 1, mu.2) { 
Mu21  <-  cbind(mu.2,mu. 1) 
out  <-  diag(Mu21,/.*%Sigma%*,/.t(Mu21)) 
out} 

tau  <-  sqrt(var . func (Sigma, mu. l,mu.2) ) 

51  <-  sum(abs(devl/tau)) 

########################### 

#  second-order  prediction  # 
########################### 

sigmal2  <-  Sigma[l,2] 

g2  <-  mu.l*mu.2  +  sigmal2 

res2  <-  g.yl.y2  -  g2 

T2  <-  sum(abs(res2)) 

theta  <-  sqrt( var.func (Sigma, mu.l, mu.2)  +  Sigma[l,l] *Sigma[2,2]  +  sigmal2"2) 

52  <-  sum(abs(res2/theta)) 


106 


################## 
#  naive  approach  # 
################## 

##  first-order  model  ## 

xpx  <-  t(xy/.*y.x 

XpXinv  <-  solve (XpX) 

naive. xl.x2. beta  <-  XpXinv%*%t(X),/.*,/.g.yl.y2 

naive. xl.x2. beta 

cbind(g.yl.y2,  X,/.*,/.naive.xl.x2.beta,  g.yl.y2-X,/.*'/.naive.xl  .x2.beta) 

sum(abs(g.yl.y2  -  X'/,*y„naive.xl.x2.beta)) 

##  second-order  model  with  interaction  ## 

X  <-  cbind(rep(l,5),xl,x2,xl*x2) 

xpx  <-  t(x)y.*y.x 

XpXinv  <-  solve (XpX) 

naive. xl.x2. beta  <-  XpXinvy.*%t(X),/.*'/.g.yl.y2 

naive. xl.x2. beta 

cbind (g .  y  1 .  y2 ,  xy.*y.naive .  xl .  x2 .  beta ,  g .  y  1 .  y2-xy.*°/.naive .  xl .  x2 .  beta) 

sum(abs(g.yl.y2  -  Xy,*y,naive.xl.x2.beta)) 


APPENDIX  C 

S-PLUS  CODE:  MODELING  RATIO 

CHAPTER  5  EXAMPLE  4 

############## 

#  data  entry  # 
############## 

xl  <-  c(-l, 1,-1, 1,-1, 1,0, 0,0, -1,1, -1,1, -1,1, 0,0,0) 

x2  <-  c(-l, -1,1, 1,0, 0,-1, 1,0, -1,-1, 1,1, 0,0, -1,1,0) 

XI  <-  cbind(rep(l,18),xl,x2,xl"2,x2"2) 

X2  <-  cbind(rep(l,18),xl,x2,xl*x2) 

xipxi  <-  t(xir/.*y.xi 

XlpXlinv  <-  solve (XlpXl) 

X2pX2  <-  t(X2)y.*y.X2 

X2pX2inv  <-  solve (X2pX2) 

yly2. data. mat  <-  round (matrix (c( 

39.34549,  67.16920, 

66.09804,  66.43879, 

69.41887,  47.11301, 

88.62453,  41.51223, 

48.72734,  40.27700, 

66.75034,  41.62833, 

66.49870,  102.21030, 

74.82207,  50.03018, 

66.45670,  64.02044, 

44.17639,  77.52190, 

60.73645,  58.57490, 

65.83143,  44.16316, 

88.52542,  43.13480, 

54.87236,  59.17590, 
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72.92504,  46.44304, 

47.25150,  35.81516, 

78.90568,  48.66641, 

58.65538,  52.56255) ,byrow=T,nrow=18) , 1) 

yl  <-  yly2.data.mat[,l] 

y2  <-  yly2.data.mat[,2] 

g.yl.y2  <-  yl/y2 

cbind(xl,x2,yl,y2,g.yl.y2) 

################## 

#  naive  approach  # 
################## 

##  second-order  model  ## 

X  <-  cbind(rep(l,18),xl,x2,xl~2,xl*x2,x2-2) 

XpX  <-  t(X)'/.*°/.X 

XpXinv  <-  solve (XpX) 

naive. beta  <-  XpXinv,/.*,/.t(X),/.*'/.g.yl.y2 

naive. beta 

cbind(g.yl.y2,  X'/.*'/.naive.beta,  g.yl.y2-X7.*°/.naive.beta) 

sum(abs(g.yl.y2  -  X%**/.naive.beta)) 

############################################ 

#  estimation  of  variance-covariance  matrix  # 
############################################ 

beta.l  <-  XlpXlinv,/.*,/.t(Xl),/.*y.yl 

beta. 2  <-  X2pX2inv,/.*,/.t(X2),/.*,/.y2 

ylols  <-  Xl'/.+y.beta.l 

y2ols  <-  X2y.*'/.beta.2 
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res.lols  <-  yl  -  ylols 

res.2ols  <-  y2  -  y2ols 

Res  <-  cbind (res.lols,  res.2ols) 

Sigma. z  <-  (l/length(yl))*t(Res)'/.*'/.Res 

z.sigmall   <-  Sigma. z [1,1] 

z.sigmal2  <-  Sigma. z [1,2] 

z.sigma22  <-  Sigma. z [2,2] 

Sigma. z 

########################## 

#  GLS  estimation  of  BETA  # 
########################## 

dstar  <-  kronecker (Sigma. z,  diag(18)) 

dstarinv  <-  solve (dstar) 

Z  <-  blkdiag(list(Xl,X2))  ##  see  below  for  blkdiag  function  ## 

Y  <-  matrix(c(yl,y2) ,ncol=l) 

beta .  z  <-  solve  (t  (Z)  ,/.*,/.dstar inv'/.f'/.Z)  %*%t  (Z)  7.*'/.dstar inv0/.*'/.Y 

betal.z  <-  beta.z[l:ncol(Xl)] 

beta2.z  <-  beta.z[(ncol(Xl)+l) : (ncol(Xl)+ncol(X2))] 

##  blkdiag  function  ## 

blkdiag  _  function(x)  { 

#  x  is  a  list  of  matrices 
sizescol  _  sapply(x.ncol) 
totcol  _  sum(sizescol) 
sizesrow  _  sapply(x.nrow) 
totrow  _  sum(sizesrow) 

z  _  matrix (O,ncol=totcol,nrow=totrow) 
poscol  _  c(l,cumsum(sizescol)+l) 
posrow_  c(l,cumsum(sizesrow)+l) 
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for   (i   in   (1 : length (x)))    { 
epos  _   (poscol[i]) : (poscol[i+l]-l) 
rpos  _    (posrow[i]) : (posrow[i+l]-l) 
z[rpos,cpos]    _  x[[i]] 

} 

z 

} 

########################## 
#  first-order  prediction  # 
########################## 

ylz  <-  Xl7.*'/.betal.z 
y2z  <-  X2'/.*,/.beta2 .  z 
cbind(yl,ylz,yl-ylz,y2,y2z,y2-y2z) 

glz  <-  ylz/y2z 

devlz  <-  g.yl.y2  -  glz 

cbind(g.yl.y2,  glz,  devlz) 

Tl  <-  sum(abs (devlz)) 

tau.func  <-  function(Sigma.z,ylz,y2z){ 

diff.mat  <-  cbind(l/y2z,  -ylz/(y2z~2)) 

out  <-  diag(diff  .mat"/.*,/.Sigma.z,/.*,/.t  (diff.mat)) 

out} 

tau  <-  sqrt (tau.func (Sigma. z, ylz, y2z)) 
SI  <-  sum(abs(devlz/tau)) 

########################### 
#  second-order  prediction  # 
########################### 

g2z  <-  ylz/y2z  +  z.sigmal2*(-l/(y2z~2))  +  z.sigma22*(ylz/(y2z~3)) 

dev2z  <-  g.yl.y2  -  g2z 

T2  <-  sum(abs(dev2z)) 
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theta  <-  sqrt(z.sigmall*(l/y2z~2)  +  z.sigma22*(-ylz/y2z~2)~2  + 
2*z.sigmal2*(l/y2z)*(-ylz/y2z~2)  +  2*z.sigma22"2*(ylz/y2z"3)"2  + 
(z.sigmall*z.sigma22  +  z.sigmal2~2)*(-l/y2z~2)~2  + 
2* (2*z . sigma22*z . sigmal2) * (-l/y2z~2) * (ylz/y2z"3) ) 

S2  <-  sum(abs(dev2z/theta)) 
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